Abstract

It is important that practical data-flow analyzers are backed by reliably proven theoretical 
results. Abstract interpretation provides a sound mathematical framework and necessary 
generic properties for an abstract domain to be well-defined and sound with respect to 
the concrete semantics. In logic programming, the abstract domain Sharing is a standard 
choice for sharing analysis for both practical work and further theoretical study. In spite 
of this, we found that there were no satisfactory proofs for the key properties of commuta- 
tivity and idempotence that are essential for Sharing to be well-defined and that published 
statements of the soundness of Sharing assume the occurs-check. This paper provides a 
generalization of the abstraction function for Sharing that can be applied to any language, 
with or without the occurs-check. Results for soundness, idempotence and commutativity 
for abstract unification using this abstraction function are proven. 

Keywords: Abstract Interpretation; Logic Programming; Occurs-Check; Rational 
Trees; Set-Sharing. 

1 Introduction 

In abstract interpretation, the concrete semantics of a program is approximated 
by an abstract semantics; that is, the concrete domain is replaced by an abstract 
domain and each elementary operation on the concrete domain is replaced by a 
corresponding abstract operation on the abstract domain. Assuming the global 
abstract procedure mimics the concrete execution procedure, each basic operation 
on the elements of the abstract domain must produce a safe approximation of the 
corresponding operation on corresponding elements of the concrete domain. For 
logic programming, the key elementary operation is unification that computes a 
solution to a set of equations. This solution can be represented by means of a 
mapping (called a substitution) from variables to first-order terms in the language.
For global soundness of the abstract semantics, there needs to be, therefore, a 
corresponding abstract operation, aunify, that is sound with respect to unification. 

For parallelization and several other program optimizations, it is important to 
know before execution which variables may be bound to terms that share a com- 
mon variable. Jacobs and Langen developed the abstract domain Sharing (Jacobs 
and Langen 1989, Jacobs and Langen 1992) for representing and propagating the 
sharing behavior of variables and this is now a standard choice for sharing analysis. 
Subsequent research then concentrated mainly on extending the domain to incorporate
additional properties such as linearity, freeness and depth-k abstractions
(Langen 1990, Bruynooghe and Codish 1993, Codish, Dams, File and Bruynooghe
1996, King 1994, King and Soper 1994, Muthukumar and Hermenegildo 1992) or in
reducing its complexity (Bagnara, Hill and Zaffanella 1997, Bagnara, Hill and Zaf- 
fanella 2001). Key properties such as commutativity and soundness of this domain 
and its associated abstract operations such as abstract unification were normally 
assumed to hold. One reason for this was that (Jacobs and Langen 1992) includes a 
proof of the soundness and refers to the Ph.D. thesis of Langen (Langen 1990) for 
the proofs of commutativity and idempotence.^ We discuss below why these results 
are inadequate. 



1.1 Soundness of aunify

An important step in standard unification algorithms based on that of Robinson
(Robinson 1965) (such as the Martelli-Montanari algorithm (Martelli and Montanari
1982)) is the occurs-check, which avoids the generation of infinite (or cyclic)
data structures. With such algorithms, the resulting solution is both unique and 
idempotent. However, in computational terms, the occurs-check is expensive and the 
vast majority of Prolog implementations omit this test, although some Prolog
implementations do offer unification with the occurs-check as a separate built-in
predicate (in ISO Prolog (ISO/IEC 1995) the predicate is unify_with_occurs_check/2).
In addition, if the unification algorithm is based on the Martelli-Montanari algo- 
rithm but without the occurs-check step, then the resulting solution may be non- 
idempotent. Consider the following example. 

Suppose we are given as input the equation $p(z, f(x, y)) = p(f(z, y), z)$ with an
initial substitution that is empty. We apply the steps in the Martelli-Montanari



^ Even though the thesis of Langen has been published as a technical report of the University of 
Southern California, an extensive survey of the literature on Sharing indicates that the thesis 
has not been widely circulated even among researchers in the field. For instance, Langen is 
rarely credited as being the first person to integrate Sharing with linearity information, despite 
the fact that this is described in the thesis. 
procedure but without the occurs-check:

    equations                            substitution
    $p(z, f(x, y)) = p(f(z, y), z)$      $\emptyset$
    $z = f(z, y),\ f(x, y) = z$          $\emptyset$
    $f(x, y) = f(z, y)$                  $\{z \mapsto f(z, y)\}$
    $x = z,\ y = y$                      $\{z \mapsto f(z, y)\}$
    $y = y$                              $\{z \mapsto f(z, y), x \mapsto z\}$
    $\emptyset$                          $\{z \mapsto f(z, y), x \mapsto z\}$

Then $\sigma = \{z \mapsto f(z, y), x \mapsto z\}$ is the computed substitution; it is not idempotent
since, for example, $x\sigma = z$ and $x\sigma\sigma = f(z, y)$.

Non-standard equality theories and unification procedures are also available and 
used in many logic programming systems. In particular, there are theoretically coherent
languages, such as Prolog III (Colmerauer 1982), that employ an equality
theory and unification algorithm based on a theory of rational trees (possibly in- 
finite trees with a finite number of subtrees). As remarked in (Colmerauer 1982), 
complete (i.e., always terminating) unification with the omission of the occurs-check 
solves equations over rational trees. Complete unification is made available by sev- 
eral Prolog implementations. The substitutions computed by such systems are in 
rational solved form and therefore not necessarily idempotent. As an example, the 
substitution $\{x \mapsto f(x)\}$, which is clearly non-idempotent, is in rational solved form
and could itself be computed by the above algorithms. 

It is therefore important that theoretical work in data-flow analysis makes no 
assumption that the computed solutions are idempotent. In spite of this, most theoretical
work on data-flow analysis of logic programming and of Prolog assumes the
occurs-check is performed, thus allowing idempotent substitutions only. In particu- 
lar, (Jacobs and Langen 1992), (Langen 1990), and, more recently, (Cortesi and File 
1999) make this assumption in their proofs of soundness. As a consequence, their re- 
sults do not apply to the analysis of all Prolog programs. A recent exception to this 
is (King 2000) where a soundness result is proved for a domain representing just the 
pair-sharing and linearity information. In this work it is assumed that a separate 
groundness analysis is performed and its results are used to recover from the precision
losses incurred by the proposed domain. However, the problem of specifying a
sound and precise groundness analysis when dealing with possibly non-idempotent 
substitutions is completely disregarded, so that the overall solution is incomplete. 
Moreover, the proposed abstraction function is based on a limit operation that, in 
the general case, is not finitely computable.

We have therefore addressed the problem of defining a sound and precise approx- 
imation of the sharing information contained in a substitution in rational solved 
form. 

In particular, we observed that the Sharing domain is concerned with the set 
of variables occurring in a term, rather than with the term structure. We have 
therefore generalized the notion of idempotence to variable-idempotence. That is, if
$\sigma$ is a variable-idempotent substitution and $t$ is any term, then any variable which
is not in the domain of $\sigma$ and occurs in $t\sigma\sigma$ also occurs in $t\sigma$. Clearly, as illustrated
by the above example, substitutions generated by unification algorithms without
the occurs-check may not even be variable-idempotent. To resolve this, we have
devised an algorithm that transforms any substitution in rational solved form to an
equivalent (with respect to any equality theory) variable-idempotent substitution.
For instance, in the example, it would transform $\sigma$ to $\{z \mapsto f(z, y), x \mapsto f(z, y)\}$.

By suitably exploiting the properties enjoyed by variable-idempotent substitu- 
tions, we show that, for the domain Sharing, the abstract unification algorithm 
aunify is sound with respect to the actually implemented unification procedures 
for all logic programming languages. Moreover, we define a new abstraction func- 
tion mapping any set of substitutions in rational solved form into the corresponding 
abstract descriptions so that there is no need for the analyser to compute the equiv- 
alent set of variable-idempotent substitutions. We note that this new abstraction 
function is carefully chosen so as to avoid any precision loss due to the possible 
non-idempotence of the substitution. 

Note that both the notion of variable-idempotent substitution and the proven 
results relating it to arbitrary substitutions in rational solved form do not depend 
on the particular abstract domain considered. Indeed, we believe that this concept, 
perhaps with minor adjustments, can be usefully applied to other abstract domains 
when extending the soundness proofs devised for idempotent substitutions to the 
more general case of substitutions in rational solved form. 

1.2 Commutativity and Idempotence of aunify

A substitution is defined as a set of bindings or equations between variables and 
other terms. Thus, for the concrete domain, the order and multiplicity of elements 
are irrelevant in both the computation and semantics of unification. It is therefore
useful that the abstraction of the unification procedure should be unaffected by 
the order and multiplicity in which it abstracts the bindings that are present in the 
substitution. Furthermore, from a practical perspective, it is also useful if the global 
abstract procedure can proceed in a different order with respect to the concrete one 
without affecting the accuracy of the analysis results. On the other hand, as sharing 
is normally combined with linearity and freeness domains that are not idempotent 
or commutative (Langen 1990, Bruynooghe and Codish 1993, King 1994), it may be 
asked why these properties are still important for sharing analysis. In answer to this, 
we observe that the order and multiplicity in which the bindings in a substitution 
are analyzed affects the accuracy of the linearity and freeness information. It is
therefore a real advantage to be able to ignore these aspects as far as the sharing 
domain is concerned. Specifically, the order in which the bindings are analyzed can 
be chosen so as to improve the accuracy of linearity and freeness. We thus conclude 
that it is extremely desirable that aunify is also commutative and idempotent. 

We found that there was no satisfactory proof of commutativity. In addition, for 
idempotence the only previous result was given in (Langen 1990, Theorem 32) of 



Soundness, Idempotence and Commutativity of Set- Sharing 






the thesis of Langen. However, his definition of abstract unification includes the 
renaming and projection operations and, in this case, only a weak form of idem- 
potence holds. In fact, for the basic aunify operation as defined here and without 
projection and renaming, idempotence has never before been proven. We therefore 
provide here the first published proofs of these properties. 

In summary, this paper, which is an extended and improved version of (Hill, Bag- 
nara and Zaffanella 1998), provides a generalization of the abstraction function for 
Sharing that can be applied to any logic programming language dealing with syn- 
tactic term structures. The results for soundness, idempotence and commutativity 
for abstract unification using this abstraction function are proved. 

The paper is organised as follows. In the next section, the notation and def- 
initions needed for equality and substitutions in the concrete domain are given. 
In Section 3, we recall the definition of the domain Sharing and of the classical 
abstraction function defined for idempotent substitutions. We also show why this 
abstraction function cannot be applied, as is, to non-idempotent substitutions. In
Section 4, we introduce variable-idempotence and provide a transformation that 
may be used to map any substitution in rational solved form to an equivalent, 
variable-idempotent one. In Section 5, we define a new abstraction function relating
the Sharing domain to the domain of arbitrary substitutions in rational solved
form. In Section 6, we recall the definition of the abstract unification for Sharing 
and state our main results. Section 7 concludes. For the convenience of the reader, 
throughout the paper all the proofs (apart from the simpler ones) of the stated 
results are appended to the end of the corresponding section. 



2 Equations and Substitutions 

In this section we introduce the notation and some terminology concerning equality 
and substitutions that will be used in the rest of the paper. 



2.1 Notation 

For a set $S$, $\wp(S)$ is the powerset of $S$, whereas $\wp_f(S)$ is the set of all the finite
subsets of $S$. The symbol Vars denotes a denumerable set of variables, whereas
$T_{Vars}$ denotes the set of first-order terms over Vars for some given set of function
symbols. It is assumed that there are at least two distinct function symbols, one
of which is a constant (i.e., of zero arity), in the given set. The set of variables
occurring in a syntactic object $o$ is denoted by $vars(o)$. To simplify the expressions
in the paper, any variable in a formula that is not in the scope of a quantifier is
assumed to be universally quantified. To prove the results in the paper, it is useful
to assume a total ordering, denoted with '<', on Vars.

2.2 Substitutions

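A substitution is a total function $\sigma\colon Vars \to T_{Vars}$ that is the identity almost
everywhere; in other words, the domain of $\sigma$,

$$dom(\sigma) \stackrel{\mathrm{def}}{=} \{\, x \in Vars \mid \sigma(x) \neq x \,\},$$

is finite. Given a substitution $\sigma\colon Vars \to T_{Vars}$ we overload the symbol '$\sigma$' so as
to denote also the function $\sigma\colon T_{Vars} \to T_{Vars}$ defined as follows, for each term
$t \in T_{Vars}$:

$$\sigma(t) \stackrel{\mathrm{def}}{=}
\begin{cases}
t, & \text{if $t$ is a constant symbol;} \\
\sigma(t), & \text{if } t \in Vars; \\
f(\sigma(t_1), \ldots, \sigma(t_n)), & \text{if } t = f(t_1, \ldots, t_n).
\end{cases}$$

If $t \in T_{Vars}$, we write $t\sigma$ to denote $\sigma(t)$ and $t[x/s]$ to denote $t\{x \mapsto s\}$.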

If $x \in Vars$ and $s \in T_{Vars} \setminus \{x\}$, then $x \mapsto s$ is called a binding. The set of all
bindings is denoted by Bind. Substitutions are syntactically denoted by the set of
their bindings, thus a substitution $\sigma$ is identified with the (finite) set

$$\{\, x \mapsto \sigma(x) \mid x \in dom(\sigma) \,\}.$$

Thus, $vars(\sigma)$ is the set of variables occurring in the bindings of $\sigma$ and we also
define the set of parameter variables of a substitution $\sigma$ as

$$param(\sigma) \stackrel{\mathrm{def}}{=} vars(\sigma) \setminus dom(\sigma).$$

A substitution is said to be circular if, for $n > 1$, it has the form

$$\{x_1 \mapsto x_2, \ldots, x_{n-1} \mapsto x_n, x_n \mapsto x_1\},$$

where $x_1, \ldots, x_n$ are distinct variables. A substitution is in rational solved form if it
has no circular subset. The set of all substitutions in rational solved form is denoted
by RSubst. A substitution $\sigma$ is idempotent if, for all $t \in T_{Vars}$, we have $t\sigma\sigma = t\sigma$.
The set of all idempotent substitutions is denoted by ISubst and $ISubst \subset RSubst$.

Example 1 

The following hold: 

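$$\begin{aligned}
\{x \mapsto y, y \mapsto a\} &\in RSubst \setminus ISubst, \\
\{x \mapsto a, y \mapsto a\} &\in ISubst, \\
\{x \mapsto y, y \mapsto g(y)\} &\in RSubst \setminus ISubst, \\
\{x \mapsto y, y \mapsto g(x)\} &\in RSubst \setminus ISubst, \\
\{x \mapsto y, y \mapsto x\} &\notin RSubst, \\
\{x \mapsto y, y \mapsto x, z \mapsto a\} &\notin RSubst.
\end{aligned}$$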
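
The two membership tests used in this example can be sketched in code. This is an illustrative sketch under assumptions that are not part of the paper: variables are strings, compound terms and constants are tuples headed by the functor, and the function names are hypothetical. Idempotence is tested on the domain variables only, which suffices because a substitution acts as the identity on every other variable and distributes over term structure.

```python
def apply(subst, term):
    """Apply a substitution (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply(subst, arg) for arg in args))


def in_rational_solved_form(subst):
    """True if no subset of the bindings is circular."""
    for start in subst:
        seen = {start}
        current = subst[start]
        # Follow variable-to-variable bindings until the chain leaves them.
        while isinstance(current, str) and current in subst:
            if current == start:
                return False           # x1 -> x2 -> ... -> x1
            if current in seen:
                break                  # a cycle that does not pass through start
            seen.add(current)
            current = subst[current]
    return True


def is_idempotent(subst):
    """True if applying the substitution twice equals applying it once."""
    return all(apply(subst, apply(subst, x)) == apply(subst, x) for x in subst)


a = ("a",)
examples = [
    {"x": "y", "y": a},
    {"x": a, "y": a},
    {"x": "y", "y": ("g", "y")},
    {"x": "y", "y": ("g", "x")},
    {"x": "y", "y": "x"},
    {"x": "y", "y": "x", "z": a},
]
rsubst = [in_rational_solved_form(s) for s in examples]
isubst = [in_rational_solved_form(s) and is_idempotent(s) for s in examples]
```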

We have assumed that there is a total ordering '<' for Vars. We say that $\sigma \in$
RSubst is ordered (with respect to this ordering) if, for each binding $(v \mapsto w) \in \sigma$
such that $w \in param(\sigma)$, we have $w < v$.

The composition of substitutions is defined in the usual way. Thus $\tau \circ \sigma$ is the
substitution such that, for all terms $t \in T_{Vars}$,

$$(\tau \circ \sigma)(t) = \tau(\sigma(t))$$

and has the formulation

$$\tau \circ \sigma = \{\, x \mapsto x\sigma\tau \mid x \in dom(\sigma),\ x \neq x\sigma\tau \,\} \cup \{\, x \mapsto x\tau \mid x \in dom(\tau) \setminus dom(\sigma) \,\}. \qquad (1)$$

As usual, $\sigma^0$ denotes the identity function (i.e., the empty substitution) and, when
$i > 0$, $\sigma^i$ denotes the substitution $(\sigma \circ \sigma^{i-1})$.
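
Formulation (1) translates directly into code. The sketch below is illustrative only and relies on an assumed encoding (variables as strings, compound terms and constants as tuples headed by the functor); the names `apply` and `compose` are hypothetical.

```python
def apply(subst, term):
    """Apply a substitution (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply(subst, arg) for arg in args))


def compose(tau, sigma):
    """Return tau o sigma, built as in formulation (1)."""
    result = {}
    # First set of (1): bindings x -> x sigma tau for x in dom(sigma),
    # dropping those that would map x to itself.
    for x in sigma:
        image = apply(tau, apply(sigma, x))
        if image != x:
            result[x] = image
    # Second set of (1): bindings of tau for variables outside dom(sigma).
    for x in tau:
        if x not in sigma:
            result[x] = tau[x]
    return result


tau = {"y": ("a",)}
sigma = {"x": ("f", "y")}
composed = compose(tau, sigma)
```

For any term `t`, `apply(compose(tau, sigma), t)` should coincide with `apply(tau, apply(sigma, t))`, which is the defining property of the composition.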

2.3 Equations 

An equation is of the form $s = t$ where $s, t \in T_{Vars}$. Eqs denotes the set of all
equations. A substitution $\sigma$ may be regarded as a finite set of equations, that is, as
the set $\{\, x = t \mid (x \mapsto t) \in \sigma \,\}$. We say that a set of equations $e$ is in rational solved
form if $\{\, s \mapsto t \mid (s = t) \in e \,\} \in RSubst$. In the rest of the paper, we will often
write a substitution $\sigma \in RSubst$ to denote a set of equations in rational solved form
(and vice versa).

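We assume that any equality theory T over $T_{Vars}$ includes the congruence axioms
denoted by the following schemata:

$$s = s, \qquad (2)$$
$$s = t \leftrightarrow t = s, \qquad (3)$$
$$r = s \land s = t \rightarrow r = t, \qquad (4)$$
$$s_1 = t_1 \land \cdots \land s_n = t_n \rightarrow f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n). \qquad (5)$$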



In logic programming and most implementations of Prolog it is usual to assume an 
equality theory based on syntactic identity. This consists of the congruence axioms 
together with the identity axioms denoted by the following schemata, where $f$ and
$g$ are distinct function symbols or $n \neq m$:

$$f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n) \rightarrow s_1 = t_1 \land \cdots \land s_n = t_n, \qquad (6)$$
$$\neg\bigl(f(s_1, \ldots, s_n) = g(t_1, \ldots, t_m)\bigr). \qquad (7)$$

The axioms characterized by schemata (6) and (7) ensure the equality theory de- 
pends only on the syntax. The equality theory for a non-syntactic domain replaces 
these axioms by ones that depend instead on the semantics of the domain and, in 
particular, on the interpretation given to functor symbols. 

The equality theory of Clark (Clark 1978) on which pure logic programming 
is based, usually called the Herbrand equality theory, is given by the congruence 
axioms, the identity axioms, and the axiom schema 

$$\forall z \in Vars : \forall t \in (T_{Vars} \setminus Vars) : z \in vars(t) \rightarrow \neg(z = t). \qquad (8)$$

Axioms characterized by the schema (8) are called the occurs-check axioms and are 
an essential part of the standard unification procedure in SLD-resolution. 

An alternative approach, used in some implementations of Prolog, does not require
the occurs-check axioms. This approach is based on the theory of rational
trees (Colmerauer 1982, Colmerauer 1984). It assumes the congruence axioms and 
the identity axioms together with a uniqueness axiom, for each substitution in ra- 
tional solved form. Informally speaking these state that, after assigning a ground 
rational tree to each parameter variable, the substitution uniquely defines a ground 
rational tree for each of its domain variables. Note that being in rational solved form 
is a very weak property. Indeed, unification algorithms returning a set of equations 
in rational solved form are allowed to be much more "lazy" than one would usu- 
ally expect (e.g., see the first substitution in Example 1). We refer the interested 
reader to (Jaffar, Lassez and Maher 1987, Keisu 1994, Maher 1988) for details on 
the subject. 

In the sequel we will use the expression "equality theory" to denote any consistent,
decidable theory T satisfying the congruence axioms. We will also use the
expression "syntactic equality theory" to denote any equality theory T also satisfying
the identity axioms.^ When the equality theory T is clear from the context,
it is convenient to adopt the notations $\sigma \Rightarrow \tau$ and $\sigma \Leftrightarrow \tau$, where $\sigma, \tau$ are sets
of equations, to denote $T \vdash \forall(\sigma \rightarrow \tau)$ and $T \vdash \forall(\sigma \leftrightarrow \tau)$, respectively.

Given an equality theory T, and a set of equations in rational solved form $\sigma$, we
say that $\sigma$ is satisfiable in T if $T \vdash \forall\, Vars \setminus dom(\sigma) : \exists\, dom(\sigma) \,.\, \sigma$. If T is a syntactic
equality theory that also includes the occurs-check axioms, and $\sigma$ is satisfiable in
T, then we say that $\sigma$ is Herbrand.

Given a satisfiable set of equations $e \in \wp_f(Eqs)$ in an equality theory T, then a
substitution $\sigma \in RSubst$ is called a solution for $e$ in T if $\sigma$ is satisfiable in T and
$T \vdash \forall(\sigma \rightarrow e)$. If $vars(\sigma) \subseteq vars(e)$, then $\sigma$ is said to be a relevant solution for $e$.
In addition, $\sigma$ is a most general solution for $e$ in T if $T \vdash \forall(\sigma \leftrightarrow e)$. In this paper,
a most general solution is always a relevant solution of $e$.

Observe that, given an equality theory T, a set of equations in rational solved 
form may not be satisfiable in T. For example, $\exists x : \{x = f(x)\}$ is false in the Clark
equality theory. 

Lemma 1 

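Suppose T is an equality theory, $\sigma \in RSubst$ is satisfiable in T, $x \in Vars \setminus dom(\sigma)$,
and $a \in T_{\emptyset}$. Then, $\sigma' \stackrel{\mathrm{def}}{=} \sigma \cup \{x \mapsto a\} \in RSubst$ and $\sigma'$ is satisfiable in T.

Proof

As $x \notin dom(\sigma)$ and $\sigma \in RSubst$ and $a \in T_{\emptyset}$, it follows that $\sigma' = \sigma \cup \{x \mapsto a\} \in$
RSubst.

Since $\sigma$ is satisfiable in T,

$$T \vdash \forall\, Vars \setminus dom(\sigma) : \exists\, dom(\sigma) \,.\, \sigma.$$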

^ Note that, as a consequence of axiom (7) and the assumption that there are at least two distinct
function symbols in the language, one of which is a constant, there exist two terms $a_1, a_2 \in T_{\emptyset}$
such that, for any syntactic equality theory T, we have $T \vdash a_1 \neq a_2$.

Moreover, by the congruence axiom (2),

$$T \vdash \forall\, Vars \setminus \{x\} : \exists x \,.\, \{x = a\}.$$

Hence,

$$T \vdash \forall\, Vars \setminus (dom(\sigma) \cup \{x\}) : \exists (dom(\sigma) \cup \{x\}) \,.\, \sigma \cup \{x = a\}.$$

Thus $\sigma' = \sigma \cup \{x \mapsto a\}$ is satisfiable in T. □

Syntactically we have shown that any substitution in RSubst may be regarded
as a set of equations in rational solved form and vice versa. The next lemma shows
the semantic relationship between them.

Lemma 2

If T is an equality theory and $\sigma \in RSubst$, then, for each $t \in T_{Vars}$,

$$T \vdash \forall\bigl(\sigma \rightarrow (t = t\sigma)\bigr).$$

Proof

We assume the congruence axioms hold and prove that, for any $t \in T_{Vars}$, we have
$\sigma \Rightarrow \{t = t\sigma\}$. The proof is by induction on the depth of $t$.

Suppose, first, that the depth of $t$ is one. If $t$ is a variable not in $dom(\sigma)$ or a
constant, then $t\sigma = t$ and the result follows from axiom (2). If $t \in dom(\sigma)$, then,
for some $r \in T_{Vars}$, $(t \mapsto r) \in \sigma$. Thus $\sigma \Rightarrow \{t = t\sigma\}$.

If the depth of $t$ is greater than one, then $t$ has the form $f(s_1, \ldots, s_n)$ where
$s_1, \ldots, s_n \in T_{Vars}$ have depth less than the depth of $t$. By the inductive hypothesis,
for each $i = 1, \ldots, n$, we have $\sigma \Rightarrow \{s_i = s_i\sigma\}$. Therefore, applying axiom (5),
we have $\sigma \Rightarrow \{t = t\sigma\}$. □

As is common in papers involving equality, we overload the symbol '=' and use 
it to denote both equality and to represent syntactic identity. The context makes 
it clear what is intended. 



3 The Set-Sharing Domain 

In this section, we first recall the definition of the Sharing domain and present the 
(classical) abstraction function used for dealing with idempotent substitutions. We 
will then give evidence for the problems arising when applying this abstraction 
function to the more general case of substitutions in rational solved form. 

3.1 The Sharing Domain 

The Sharing domain is due to Jacobs and Langen (Jacobs and Langen 1989). However,
we use the definition as presented in (Bagnara et al. 1997) where the set of
variables of interest is given explicitly. 

Definition 1

(The set-sharing lattice.) Let

$$SG \stackrel{\mathrm{def}}{=} \wp_f(Vars) \setminus \{\emptyset\}$$

and let

$$SH \stackrel{\mathrm{def}}{=} \wp(SG).$$

The set-sharing lattice is given by the set

$$SS \stackrel{\mathrm{def}}{=} \{\, (sh, U) \mid sh \in SH,\ U \in \wp_f(Vars),\ \forall S \in sh : S \subseteq U \,\} \cup \{\bot, \top\},$$

which is ordered by '$\preceq_{SS}$' defined as follows, for each $d, (sh_1, U_1), (sh_2, U_2) \in SS$:

$$\bot \preceq_{SS} d,$$
$$d \preceq_{SS} \top,$$
$$(sh_1, U_1) \preceq_{SS} (sh_2, U_2) \iff (U_1 = U_2) \land (sh_1 \subseteq sh_2).$$

It is straightforward to see that every subset of $SS$ has a least upper bound with
respect to $\preceq_{SS}$. Hence $SS$ is a complete lattice.^ The lub operator over $SS$ will be
denoted by '$\sqcup$'.



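^ Notice that the only reason we have $\top \in SS$ is in order to turn $SS$ into a lattice rather than a
CPO.

3.2 The Classical Abstraction Function for ISubst

An element $sh$ of $SH$ encodes the sharing information contained in an idempotent
substitution $\sigma$. Namely, two variables $x$ and $y$ must be in the same set in $sh$ if some
variable occurs in both $x\sigma$ and $y\sigma$.

Definition 2

(Classical sg and abstraction functions.) $sg\colon ISubst \times Vars \to \wp_f(Vars)$, called
sharing group function, is defined, for each $\sigma \in ISubst$ and each $v \in Vars$, by

$$sg(\sigma, v) \stackrel{\mathrm{def}}{=} \{\, y \in Vars \mid v \in vars(y\sigma) \,\}.$$

The concrete domain $\wp(ISubst)$ is related to $SS$ by means of the abstraction function
$\alpha_I\colon \wp(ISubst) \times \wp_f(Vars) \to SS$. For each $S \in \wp(ISubst)$ and each $U \in \wp_f(Vars)$,

$$\alpha_I(S, U) \stackrel{\mathrm{def}}{=} \bigsqcup_{\sigma \in S} \alpha_I(\sigma, U),$$

where $\alpha_I\colon ISubst \times \wp_f(Vars) \to SS$ is defined, for each substitution $\sigma \in ISubst$ and
each $U \in \wp_f(Vars)$, by

$$\alpha_I(\sigma, U) \stackrel{\mathrm{def}}{=} \bigl(\{\, sg(\sigma, v) \cap U \mid v \in Vars \,\} \setminus \{\emptyset\},\ U\bigr).$$
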
The sharing group function sg was first defined by Jacobs and Langen (Jacobs and
Langen 1989) and used in their definition of a concretisation function for $SH$. The
function $\alpha_I$ corresponds closely to the abstract counterpart of this concretisation
function, but explicitly includes the set of variables of interest as a separate argument.
It is identical to the abstraction function for Sharing defined by Cortesi and
File (Cortesi and File 1999).

In order to provide an intuitive reading of the sharing information encoded into 
an abstract element, we should stress that the analysis aims at capturing possi- 
ble sharing. The corresponding definite information (e.g., definite groundness or 
independence) can be extracted by observing which sharing groups are not in the 
abstract element. As an example, if we observe that there is no sharing group containing
a particular variable of $U$, then we can safely conclude that this variable is
definitely ground (namely, it is bound to a term containing no variables). Similarly,
if we observe that two variables never occur together in the same sharing group,
then we can safely conclude that they are independent (namely, they are bound
to terms that do not share a common variable). For a more detailed description of 
the information contained in an element of SS, we refer the interested reader to 
(Bagnara et al. 1997, Bagnara et al. 2001). 
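
The reading of definite groundness and independence from an abstract element can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: variables are strings, compound terms and constants are tuples headed by the functor, and all function names are hypothetical. Only the sharing-group component for a single substitution is computed, restricted to the variables of interest $U$; the substitution used is the one of Example 2.

```python
def apply(subst, term):
    """Apply a substitution (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply(subst, arg) for arg in args))


def vars_of(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(vars_of(arg) for arg in term[1:]))


def sharing_groups(subst, U):
    """Non-empty sets sg(subst, v) intersected with U, for the classical abstraction."""
    images = {y: vars_of(apply(subst, y)) for y in U}
    # Only variables occurring in some image can yield a non-empty group.
    candidates = set().union(*images.values())
    return {frozenset(y for y in U if v in images[y]) for v in candidates}


def is_ground(groups, x):
    """x is definitely ground if no sharing group contains it."""
    return all(x not in g for g in groups)


def are_independent(groups, x, y):
    """x and y are definitely independent if no group contains both."""
    return all(not (x in g and y in g) for g in groups)


U = {"x1", "x2", "x3", "x4"}
sigma = {"x1": ("f", "x2", "x3"), "x4": ("a",)}
groups = sharing_groups(sigma, U)
```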

Example 2

Assume $U = \{x_1, x_2, x_3, x_4\}$ and let

$$\sigma = \{x_1 \mapsto f(x_2, x_3), x_4 \mapsto a\}$$

so that its abstraction is given by

$$\alpha_I(\sigma, U) = \bigl(\{\{x_1, x_2\}, \{x_1, x_3\}\}, U\bigr).$$

From this abstraction we can safely conclude that variable $x_4$ is ground and variables
$x_2$ and $x_3$ are independent.

3.3 Towards an Abstraction Function for RSubst

To help motivate the approach we have taken in adapting the classical abstraction
function to non-idempotent substitutions, we now explain some of the problems
that arise if we apply $\alpha_I$, as it is defined on ISubst, to the non-idempotent
substitutions in RSubst. Note that these problems are only partially due to allowing
for non-Herbrand substitutions (that is, substitutions that are not satisfiable in a
syntactic equality theory containing the occurs-check axioms). They are also due to
the presence of non-idempotent but Herbrand substitutions that may arise because
of the potential "laziness" of unification procedures based on the rational solved
form.

We use the following substitutions to illustrate the problems, where it is assumed






P. M. Hill, R. Bagnara and E. Zaffanella 



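that the set of variables of interest is $U = \{x_1, x_2, x_3, x_4\}$. Let

$$\begin{aligned}
\sigma_1 &= \{x_1 \mapsto f(x_1)\}, \\
\sigma_2 &= \{x_3 \mapsto x_4\}, \\
\sigma_3 &= \{x_1 \mapsto x_2, x_2 \mapsto x_3, x_3 \mapsto x_4\}, \\
\sigma_4 &= \{x_1 \mapsto x_4, x_2 \mapsto x_4, x_3 \mapsto x_4\}
\end{aligned}$$

so that we have

$$\begin{aligned}
\alpha_I(\emptyset, U) = \alpha_I(\sigma_1, U) &= \bigl(\{\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}\}, U\bigr), \\
\alpha_I(\sigma_2, U) = \alpha_I(\sigma_3, U) &= \bigl(\{\{x_1\}, \{x_2\}, \{x_3, x_4\}\}, U\bigr), \\
\alpha_I(\sigma_4, U) &= \bigl(\{\{x_1, x_2, x_3, x_4\}\}, U\bigr).
\end{aligned}$$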

The first problem is that the concrete equivalence classes induced by the classical 
abstraction function on RSubst are much coarser than one would expect and hence 
we have an unwanted loss of precision. For example, in all the sets of rational 
trees that are solutions for $\sigma_1$, the variable $x_1$ is ground. However, the computed
abstract element fails to distinguish this situation from that resulting from the
empty substitution, where all the variables are free and un-aliased. Similarly, we
have the same abstract element for both $\sigma_2$ and $\sigma_3$ although $x_1$, $x_2$ and $x_3$ are
independent in $\sigma_2$ only.

The second problem is quite the opposite from the first in that the abstraction 
function distinguishes between substitutions that are equivalent (with respect to 
any equality theory). For example, $\sigma_3$ and $\sigma_4$ are equivalent although the abstract
elements are distinct. Note that the two problems described here are completely 
orthogonal although they can interact and produce more complex situations. 
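
Both problems can be reproduced by applying the classical sharing-group computation, unchanged, to substitutions that are not idempotent. The sketch below is illustrative only: the term encoding (variables as strings, compound terms as tuples headed by the functor) and the function names are assumptions, and the four substitutions are written out explicitly as they are read in this section, namely a self-referential binding for `s1`, a single aliasing for `s2`, a chain of aliasings for `s3`, and its fully dereferenced form for `s4`.

```python
def apply(subst, term):
    """Apply a substitution (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply(subst, arg) for arg in args))


def vars_of(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(vars_of(arg) for arg in term[1:]))


def sharing_groups(subst, U):
    """Classical sharing groups, computed with a single application of subst."""
    images = {y: vars_of(apply(subst, y)) for y in U}
    candidates = set().union(*images.values())
    return {frozenset(y for y in U if v in images[y]) for v in candidates}


U = {"x1", "x2", "x3", "x4"}
empty = {}
s1 = {"x1": ("f", "x1")}
s2 = {"x3": "x4"}
s3 = {"x1": "x2", "x2": "x3", "x3": "x4"}
s4 = {"x1": "x4", "x2": "x4", "x3": "x4"}

first_problem = sharing_groups(empty, U) == sharing_groups(s1, U)
also_first_problem = sharing_groups(s2, U) == sharing_groups(s3, U)
second_problem = sharing_groups(s3, U) != sharing_groups(s4, U)
```

The first two comparisons show substitutions with different sharing behavior collapsing to one abstract element; the third shows two equivalent substitutions being told apart.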



4 Variable-Idempotence 

In this section we define a new class of substitutions based on the concept of 
variable-idempotence. Variable-idempotent substitutions are then related to sub- 
stitutions in rational solved form by means of an equivalence preserving rewriting 
relation. 

4.1 Variable-idempotent Substitutions

Recall that, for substitutions, the definition of idempotence requires that repeated 
applications of a substitution do not change the syntactic structure of a term. 
However, a sharing abstraction such as $\alpha_I$ is only interested in the variables and
not in the structure that contains them. Thus, an obvious way to relax the definition 
of idempotence to allow for a non-Herbrand substitution is to ignore the structure 
and just require that its repeated application leaves the set of free variables in a 
term invariant. 

Definition 3

(Variable-Idempotence.) A substitution $\sigma$ is said to be variable-idempotent if
$\sigma \in RSubst$ and, for each $t \in T_{Vars}$,

$$vars(t\sigma\sigma) \setminus dom(\sigma) = vars(t\sigma) \setminus dom(\sigma).$$

The set of all variable-idempotent substitutions is denoted by VSubst.

Note that, as the condition $vars(t\sigma) \setminus dom(\sigma) \subseteq vars(t\sigma\sigma)$ is trivial and holds for
all substitutions, we have $\sigma \in VSubst$ if and only if $\sigma \in RSubst$ and

$$vars(t\sigma\sigma) \setminus dom(\sigma) \subseteq vars(t\sigma). \qquad (9)$$

Also note that any idempotent substitution is also variable-idempotent, so that
$ISubst \subset VSubst \subset RSubst$.

Example 3 

Consider the following substitutions which are all in RSubst.

$$\begin{aligned}
\sigma_1 &= \{x \mapsto f(x)\} \in VSubst \setminus ISubst, \\
\sigma_2 &= \{x \mapsto f(y), y \mapsto z\} \notin VSubst, \\
\sigma_3 &= \{x \mapsto f(z), y \mapsto z\} \in ISubst, \\
\sigma_4 &= \{x \mapsto z, y \mapsto f(x, y)\} \notin VSubst, \\
\sigma_5 &= \{x \mapsto z, y \mapsto f(z, y)\} \in VSubst \setminus ISubst.
\end{aligned}$$

Note that $\sigma_2$ is equivalent (with respect to any equality theory) to the idempotent
substitution $\sigma_3$; and $\sigma_4$ is equivalent (with respect to any equality theory) to the
substitution $\sigma_5$ which is variable-idempotent but not idempotent.

The next result provides an alternative characterization of variable-idempotence. 

Lemma 3 

Suppose that $\sigma \in RSubst$. Then $\sigma \in VSubst$ if and only if, for all $(x \mapsto r) \in \sigma$,
$vars(r\sigma) \setminus dom(\sigma) = vars(r) \setminus dom(\sigma)$.

Proof

Suppose first that $\sigma \in VSubst$ and that $(x \mapsto r) \in \sigma$. Then

$$vars(x\sigma\sigma) \setminus dom(\sigma) = vars(x\sigma) \setminus dom(\sigma)$$

and hence, $vars(r\sigma) \setminus dom(\sigma) = vars(r) \setminus dom(\sigma)$.

Next, suppose that for all $(x \mapsto r) \in \sigma$, $vars(r\sigma) \setminus dom(\sigma) = vars(r) \setminus dom(\sigma)$.
Let $t \in T_{Vars}$. We will show that $vars(t\sigma\sigma) \setminus dom(\sigma) = vars(t\sigma) \setminus dom(\sigma)$ by
induction on the depth of $t$. If $t$ is a constant or $t \in Vars \setminus dom(\sigma)$, then the
result follows from the fact that $t\sigma = t$. If $t \in dom(\sigma)$, then the result follows
from the hypothesis. Finally, if $t = f(t_1, \ldots, t_n)$, then, by the inductive hypothesis,
$vars(t_i\sigma\sigma) \setminus dom(\sigma) = vars(t_i\sigma) \setminus dom(\sigma)$ for $i = 1, \ldots, n$. Therefore we have
$vars(t\sigma\sigma) \setminus dom(\sigma) = vars(t\sigma) \setminus dom(\sigma)$. Thus, by Definition 3, as $\sigma \in RSubst$,
$\sigma \in VSubst$. □
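
The characterization given by Lemma 3 yields a finite test for variable-idempotence, since only the bindings of the substitution need to be inspected. The following sketch is illustrative and rests on assumptions not made in the paper: the term encoding (variables as strings, compound terms as tuples headed by the functor), the function names, and the premise that the input is already in rational solved form, which the code does not check.

```python
def apply(subst, term):
    """Apply a substitution (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    functor, *args = term
    return (functor, *(apply(subst, arg) for arg in args))


def vars_of(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(vars_of(arg) for arg in term[1:]))


def is_variable_idempotent(subst):
    """Test of Lemma 3; assumes subst is in rational solved form."""
    dom = set(subst)
    return all(
        vars_of(apply(subst, r)) - dom == vars_of(r) - dom
        for r in subst.values()
    )


# The substitutions of Example 3.
s1 = {"x": ("f", "x")}
s2 = {"x": ("f", "y"), "y": "z"}
s3 = {"x": ("f", "z"), "y": "z"}
s4 = {"x": "z", "y": ("f", "x", "y")}
s5 = {"x": "z", "y": ("f", "z", "y")}
results = [is_variable_idempotent(s) for s in (s1, s2, s3, s4, s5)]
```

The same test rejects the substitution $\{z \mapsto f(z, y), x \mapsto z\}$ from the introduction and accepts its transformed form $\{z \mapsto f(z, y), x \mapsto f(z, y)\}$.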

Note that, as a consequence of Lemma 3, any substitution consisting of a single
binding is variable-idempotent. Note though that we cannot assume that every 
subset of a variable-idempotent substitution is variable-idempotent. 

Example 4 
Let 

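    $\sigma_1 = \{x_1 \mapsto x_2, x_2 \mapsto g(x_3), x_3 \mapsto f(x_3)\},$

    $\sigma_2 = \{x_3 \mapsto f(x_3)\},$

    $\sigma_3 = \sigma_1 \setminus \sigma_2 = \{x_1 \mapsto x_2, x_2 \mapsto g(x_3)\}.$

It can be observed that $\sigma_1, \sigma_2 \in \mathit{VSubst}$. Also note that $\sigma_3 \notin \mathit{VSubst}$, because we
have $x_3 \in \mathit{vars}(x_1\sigma_3\sigma_3) \setminus \mathrm{dom}(\sigma_3)$ but $x_3 \notin \mathit{vars}(x_1\sigma_3) \setminus \mathrm{dom}(\sigma_3)$.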

On the other hand, a variable-idempotent substitution does enjoy the following 
useful property with respect to its subsets. 

Lemma 4 

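If $\sigma \in \mathit{VSubst}$ and $t \in \mathcal{T}_{\mathit{Vars}}$, then, for all $\sigma' \subseteq \sigma$,

    $\mathit{vars}(t\sigma\sigma') \setminus \mathrm{dom}(\sigma) = \mathit{vars}(t\sigma) \setminus \mathrm{dom}(\sigma).$

Proof

Observe that, since $\sigma' \subseteq \sigma$, the relation $\mathit{vars}(t\sigma) \setminus \mathrm{dom}(\sigma) \subseteq \mathit{vars}(t\sigma\sigma')$ is trivial.

To prove the opposite relation, suppose that $y \in \mathit{vars}(t\sigma\sigma') \setminus \mathrm{dom}(\sigma)$. Then there
exists $x \in \mathit{vars}(t\sigma)$ such that $y \in \mathit{vars}(x\sigma')$. Now, if $x \notin \mathrm{dom}(\sigma')$, then $x = y$
and $y \in \mathit{vars}(t\sigma)$. On the other hand, if $x \in \mathrm{dom}(\sigma')$, then $x\sigma' = x\sigma$ so that
$y \in \mathit{vars}(t\sigma\sigma) \setminus \mathrm{dom}(\sigma)$ and hence, as $\sigma \in \mathit{VSubst}$, $y \in \mathit{vars}(t\sigma)$. □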

We note that this result depends on the definition of variable-idempotence ignoring 
the domain elements of the substitution. 

Example 5 
Let 

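    $\sigma = \{x \mapsto f(x, y), y \mapsto a\}.$

Then $\sigma \in \mathit{VSubst}$ but

    $\mathit{vars}(x\sigma) = \{x, y\},$
    $\mathit{vars}(x\sigma\sigma) = \{x, y\},$
    $\mathit{vars}(x\sigma\{y \mapsto a\}) = \{x\}.$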

We now state two technical results that will be needed later in the paper. Note 
that, when proving these results at the end of this section, we require that the equal- 
ity theory also satisfies the identity axioms. They show that equivalent, ordered, 
variable-idempotent substitutions have the same domain and bind the domain vari- 
ables to terms with the same set of parameter variables. 

Lemma 5 

Suppose that $T$ is a syntactic equality theory, $\tau, \sigma \in \mathit{VSubst}$ are ordered and
satisfiable in $T$ and $T \vdash \forall(\tau \rightarrow \sigma)$. Then $\mathrm{dom}(\sigma) \subseteq \mathrm{dom}(\tau)$.

Lemma 6

Suppose that $T$ is a syntactic equality theory, $\tau, \sigma \in \mathit{VSubst}$ are satisfiable in $T$ and
$T \vdash \forall(\tau \rightarrow \sigma)$. In addition, suppose $s, t \in \mathcal{T}_{\mathit{Vars}}$ are such that $T \vdash \forall(\tau \rightarrow (s = t))$.
Then, if $v \in \mathit{vars}(s) \setminus \mathrm{dom}(\tau)$, there exists a variable $z \in \mathit{vars}(t\sigma) \setminus \mathrm{dom}(\sigma)$ such
that $v \in \mathit{vars}(z\tau)$.

4.2 S-transformations

A useful property of variable-idempotent substitutions is that any substitution can 
be transformed to an equivalent (with respect to any equality theory) variable- 
idempotent one. 

Definition 4 

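(S-transformation.) The relation $\overset{\mathcal{S}}{\longmapsto} \subseteq \mathit{RSubst} \times \mathit{RSubst}$, called S-step, is defined
by

    $\dfrac{(x \mapsto t) \in \sigma \qquad (y \mapsto s) \in \sigma \qquad x \neq y}{\sigma \overset{\mathcal{S}}{\longmapsto} (\sigma \setminus \{y \mapsto s\}) \cup \{y \mapsto s[x/t]\}}$

If we have a finite sequence of S-steps $\sigma_1 \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_n$ mapping $\sigma_1$ to $\sigma_n$, then
we write $\sigma_1 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_n$ and say that $\sigma_1$ can be rewritten, by S-transformation, to $\sigma_n$.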

Example 6 
Let 

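    $\sigma_0 = \{x_1 \mapsto f(x_2), x_2 \mapsto g(x_3, x_4), x_3 \mapsto x_1\}.$

Observe that $\sigma_0$ is not variable-idempotent since $\mathit{vars}(x_1\sigma_0) \setminus \{x_1, x_2, x_3\} = \emptyset$ but
$\mathit{vars}(x_1\sigma_0\sigma_0) \setminus \{x_1, x_2, x_3\} = \{x_4\}$. By considering all the bindings of the
substitution, one at a time, and applying the corresponding S-step to all the other bindings,
we produce a new substitution $\sigma_3$.

    $\sigma_0 = \{x_1 \mapsto f(x_2), x_2 \mapsto g(x_3, x_4), x_3 \mapsto x_1\},$

    $\sigma_1 = \{x_1 \mapsto f(x_2), x_2 \mapsto g(x_3, x_4), x_3 \mapsto f(x_2)\},$

    $\sigma_2 = \{x_1 \mapsto f(g(x_3, x_4)), x_2 \mapsto g(x_3, x_4), x_3 \mapsto f(g(x_3, x_4))\},$

    $\sigma_3 = \{x_1 \mapsto f(g(f(g(x_3, x_4)), x_4)),$
    $\qquad\;\; x_2 \mapsto g(f(g(x_3, x_4)), x_4), x_3 \mapsto f(g(x_3, x_4))\}.$

Then

    $\sigma_0 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_1 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_2 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_3.$

Note that $\sigma_0 \iff \sigma_3$ and, for any $\tau \subseteq \sigma_3$, the substitution $\tau$ is variable-idempotent.
In particular, $\sigma_3$ is variable-idempotent.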

The next two theorems, which are proved at the end of this section, show that 
we need only consider variable-idempotent substitutions. 
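Theorem 1

Suppose $\sigma \in \mathit{RSubst}$ and $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$. Then $\sigma' \in \mathit{RSubst}$, $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$,
$\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$ and, if $T$ is any equality theory, then $T \vdash \forall(\sigma \leftrightarrow \sigma')$.

Theorem 2

Suppose $\sigma \in \mathit{RSubst}$. Then there exists $\sigma' \in \mathit{VSubst}$ such that $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and, for
all $\tau \subseteq \sigma'$, $\tau \in \mathit{VSubst}$.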

As a consequence of Theorem 2, we can transform any substitution in rational solved
form to a substitution for which it and all its subsets are variable-idempotent. Thus,
substitutions such as $\sigma_1$ in Example 4 can be disregarded. The proof of this theorem
formalizes the rewriting process informally described in Example 6.

The following result concerning composition of substitutions will be needed later. 

Lemma 7 

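Let $\tau, \sigma \in \mathit{VSubst}$, where $\mathrm{dom}(\sigma) \cap \mathit{vars}(\tau) = \emptyset$. Then $\tau \circ \sigma$ has the following
properties.

1. $T \vdash \forall((\tau \circ \sigma) \leftrightarrow (\tau \cup \sigma))$, for any equality theory $T$;

2. $\mathrm{dom}(\tau \circ \sigma) = \mathrm{dom}(\tau \cup \sigma)$;

3. $\tau \circ \sigma \in \mathit{VSubst}$.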

4.3 The Abstraction Function for VSubst

With these results, it can be seen that we need to consider variable-idempotent
substitutions only. Moreover, in this case, one of the causes of the problems
outlined in Section 3.3, due to the possible "laziness" of the unification algorithm, is no
longer present. As a consequence, it is now sufficient to address the potential loss
in precision due to the non-Herbrand substitutions. The simple solution is to define
a new abstraction function for VSubst which is the same as that in Definition 2
but where any sharing group generated by a variable in the domain of the
substitution is disregarded. This new abstraction function works for variable-idempotent
substitutions and no longer suffers the drawbacks outlined in Section 3.3.

Therefore, at least from a theoretical point of view, the problem of defining a 
sound and precise abstraction function for arbitrary substitutions in rational solved 
form would have been solved. Given a substitution in RSubst, we would proceed in 
two steps: we first transform it to an equivalent substitution in VSubst and then 
compute the corresponding description by using the modified abstraction function.
However, from a practical point of view, we need to define an abstraction function 
that directly computes the description of a substitution in RSubst in a single step, 
thus avoiding the expensive computation of the intermediate variable-idempotent 
substitution. We present such an abstraction function in Section 5. 

4.4 Proofs of Lemmas 5, 6 and 7 and Theorems 1 and 2

To prove Lemmas 5 and 6, it is useful to first establish the following two properties 
of variable-idempotent substitutions. 
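Lemma 8

Suppose that $\sigma \in \mathit{VSubst}$, $r \in \mathcal{T}_{\mathit{Vars}}$ and, for all $i \geq 0$, $r\sigma^i \in \mathit{Vars}$. Then we have
$r\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$.

Proof

As $\sigma$ has no circular subset and $\mathrm{dom}(\sigma)$ is finite, there exists a $j > 1$ such that
$r\sigma^j = r\sigma^{j+1}$ and hence, $r\sigma^j \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$. As $\sigma$ is variable-idempotent, we have

    $\{r\sigma^j\} = \mathit{vars}(r\sigma^j) \setminus \mathrm{dom}(\sigma)$
    $\phantom{\{r\sigma^j\}} = \mathit{vars}(r\sigma) \setminus \mathrm{dom}(\sigma)$
    $\phantom{\{r\sigma^j\}} = \{r\sigma\} \setminus \mathrm{dom}(\sigma).$

Hence $r\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$. □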



Lemma 9 

Suppose that $\sigma \in \mathit{VSubst}$ and $v, r \in \mathcal{T}_{\mathit{Vars}}$, where $v \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$ and, for any
syntactic equality theory $T$, $T \vdash \forall(\sigma \rightarrow \{v = r\})$. Then $v = r\sigma$.

Proof

We assume that the congruence and identity axioms hold. Let $a_1, a_2 \in \mathcal{T}_{\emptyset}$ have
distinct outer-most symbols so that, by the identity axioms, $T \vdash a_1 \neq a_2$. By
Lemma 8, either $r\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$ or, for some $j > 0$, $r\sigma^j \notin \mathit{Vars}$. We consider
each case separately.

If, for some $j > 0$, $r\sigma^j \notin \mathit{Vars}$, then, as $a_1$ and $a_2$ have distinct outer-most
symbols, there exists an $i \in \{1, 2\}$ such that $a_i$ and $r\sigma^j$ have distinct outer-most
symbols. Thus, by the identity axioms, $a_i \neq r\sigma^j$. Let $\sigma' = \sigma \cup \{v = a_i\}$. It
follows from Lemma 1 that, as $v \notin \mathrm{dom}(\sigma)$ and $\sigma$ is satisfiable, $\sigma' \in \mathit{RSubst}$ and is
satisfiable. By Lemma 2 and the congruence axioms, $\sigma \implies \{v = r\sigma^j\}$. However,
$\sigma' \implies \sigma$, so that $\sigma' \implies \{v = r\sigma^j, v = a_i\}$. Thus, by the congruence axioms, we
have $\sigma' \implies \{a_i = r\sigma^j\}$, which is a contradiction.

Suppose then that $r\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$. If $v \neq r\sigma$, then it follows from Lemma 1
that $\sigma' = \sigma \cup \{v = a_1, r\sigma = a_2\} \in \mathit{RSubst}$ and, as $\sigma$ is satisfiable, $\sigma'$ is satisfiable.
By Lemma 2 and the congruence axioms, $\sigma \implies \{v = r\sigma\}$. However, $\sigma' \implies \sigma$, so
that $\sigma' \implies \{v = r\sigma, v = a_1, r\sigma = a_2\}$. Thus, by the congruence axioms, we have
$\sigma' \implies \{a_1 = a_2\}$, which is a contradiction. Hence $v = r\sigma$ as required. □
Proof of Lemma 5.

We assume that the congruence and identity axioms hold. To prove the result, we
suppose that there exists $v \in \mathrm{dom}(\sigma) \setminus \mathrm{dom}(\tau)$ and derive a contradiction.

By hypothesis, $\tau \implies \sigma$. Thus, using Lemma 2 and the congruence axioms, we
have, for any $i \geq 0$, $\tau \implies \{v = v\sigma^i\}$. By Lemma 9, for all $i \geq 0$, $v = v\sigma^i\tau$ so that
$v\sigma^i \in \mathit{Vars}$. By Lemma 8, $v\sigma \notin \mathrm{dom}(\sigma)$, so that, as $\sigma$ is ordered and $v \in \mathrm{dom}(\sigma)$,
$v\sigma < v$. In particular, $v\sigma \neq v$, so that as $v\sigma\tau = v$ and $\tau$ is ordered, we would have
$v < v\sigma$, which is a contradiction. □

Proof of Lemma 6. 

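We assume that the congruence and identity axioms hold. Note that, by the
hypothesis, $\tau \implies \sigma$ and $\tau \implies \{s = t\}$ so that, using Lemma 2 and the congruence
axioms, we have $\tau \implies \{s = t\sigma^j\}$ and $\tau \implies \{t\sigma\tau^k = s\}$, for all $j, k \geq 0$.

Let $v \in \mathit{vars}(s) \setminus \mathrm{dom}(\tau)$. We prove, by induction on the depth $d$ of $s$, that there
exists $z \in \mathit{vars}(t\sigma) \setminus \mathrm{dom}(\sigma)$ such that $v \in \mathit{vars}(z\tau)$. The base case is when $d = 1$
so that $s = v$. Now, for each $j \geq 0$, $\tau \implies \{v = t\sigma^j\}$ and hence, by Lemma 9 (as
$v \notin \mathrm{dom}(\tau)$), $v = t\sigma^j\tau$. As a consequence, $t\sigma^j \in \mathit{Vars}$ for all $j \geq 0$ and $v = t\sigma\tau$. By
Lemma 8, $t\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$. Thus, we define $z = t\sigma$.

For the inductive step, we assume that $d > 1$ so that, for some $n \geq 1$, we have
$s = f(s_1, \ldots, s_n)$ and, for some $i \in \{1, \ldots, n\}$, $v \in \mathit{vars}(s_i)$ and $s_i$ has depth
$d - 1$. By Lemma 8, either $t\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$ or there exists a $j > 0$ such that
$t\sigma^j \notin \mathit{Vars}$.

First, suppose that $t\sigma \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$. Now, $\tau \implies \{t\sigma\tau = s\}$ so that, as
$s\tau \notin \mathit{Vars}$, by Lemma 9, we have $t\sigma\tau \notin \mathit{Vars} \setminus \mathrm{dom}(\tau)$. Thus, by Lemma 8, there
exists $k > 1$ such that $t\sigma\tau^k \notin \mathit{Vars}$. Then, using the identity axioms, we have
$t\sigma\tau^k = f(r_1, \ldots, r_n)$ and $\tau \implies \{s_i = r_i\}$. By the inductive hypothesis (letting $\sigma$
be the empty substitution), we have $v \in \mathit{vars}(r_i\tau)$. However, $\mathit{vars}(r_i) \subseteq \mathit{vars}(t\sigma\tau^k)$
so that $v \in \mathit{vars}(t\sigma\tau^{k+1})$. As $\tau \in \mathit{VSubst}$ and $v \notin \mathrm{dom}(\tau)$, $v \in \mathit{vars}(t\sigma\tau)$. Thus, in
this case, let $z = t\sigma$.

Secondly, suppose that there exists a $j > 0$ such that $t\sigma^j \notin \mathit{Vars}$. Then, as
$\tau \implies \{s = t\sigma^j\}$, it follows from the identity axioms that $t\sigma^j = f(t_1, \ldots, t_n)$ and
$\tau \implies \{s_i = t_i\}$. By the inductive hypothesis, there exists $z \in \mathit{vars}(t_i\sigma) \setminus \mathrm{dom}(\sigma)$
such that $v \in \mathit{vars}(z\tau)$. However, $\mathit{vars}(t_i\sigma) \subseteq \mathit{vars}(t\sigma^{j+1})$ so that we must have
$z \in \mathit{vars}(t\sigma^{j+1}) \setminus \mathrm{dom}(\sigma)$. As $\sigma \in \mathit{VSubst}$, $z \in \mathit{vars}(t\sigma) \setminus \mathrm{dom}(\sigma)$ as required. □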

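To prove Theorem 1, we need to show that the result holds for a single S-step.

Lemma 10

Let $T$ be an equality theory and suppose that $\sigma \in \mathit{RSubst}$ and $\sigma \overset{\mathcal{S}}{\longmapsto} \sigma'$. Then
$\sigma' \in \mathit{RSubst}$, $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$, $\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$, and $T \vdash \forall(\sigma \leftrightarrow \sigma')$.

Proof

Since $\sigma \overset{\mathcal{S}}{\longmapsto} \sigma'$, there exist $x, y \in \mathrm{dom}(\sigma)$ with $x \neq y$ such that $(x \mapsto t), (y \mapsto s) \in \sigma$
and $\sigma' = (\sigma \setminus \{y \mapsto s\}) \cup \{y \mapsto s[x/t]\}$. If $x \notin \mathit{vars}(s)$, $\sigma = \sigma'$ and the result is
trivial. Suppose now that $x \in \mathit{vars}(s)$. We define

    $\sigma_0 \stackrel{\mathrm{def}}{=} \sigma \setminus \{x \mapsto t, y \mapsto s\}.$

Hence, as it is assumed that $x \neq y$,

    $\sigma = \sigma_0 \cup \{x \mapsto t, y \mapsto s\},$    (10)
    $\sigma' = \sigma_0 \cup \{x \mapsto t, y \mapsto s[x/t]\}.$    (11)

We first show that $\sigma' \in \mathit{RSubst}$ and $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$. If $s \notin \mathit{Vars}$, then
$s[x/t] \notin \mathit{Vars}$ so that $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$. Also, as $\sigma$ has no circular subset, $\sigma'$ has
no circular subset and $\sigma' \in \mathit{RSubst}$. If $s \in \mathit{Vars}$, then $s = x$ and $s[x/t] = t$. Thus, as
$\sigma = \sigma_0 \cup \{x \mapsto t, y \mapsto x\}$ has no circular subset, $t \neq y$ so that $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$.
Moreover, neither $\sigma_0 \cup \{x \mapsto t\}$ nor $\sigma_0 \cup \{y \mapsto t\}$ have circular subsets. Hence $\sigma'$
has no circular subset. Thus $\sigma' \in \mathit{RSubst}$.

Now, since

    $(\mathit{vars}(s) \cup \mathit{vars}(t)) \setminus \mathrm{dom}(\sigma) = (\mathit{vars}(s[x/t]) \cup \mathit{vars}(t)) \setminus \mathrm{dom}(\sigma),$

it follows that $\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$.

Therefore, it remains to show that, for any equality theory $T$, $T \vdash \forall(\sigma \leftrightarrow \sigma')$. To
do this, we assume that the congruence axioms hold, and show that $\sigma \iff \sigma'$. By
Lemma 2, we have

    $\{x = t\} \implies \{s = s[x/t]\}.$

Thus, using the congruence axiom (4), we have

    $\{x = t, y = s\} \implies \{x = t, y = s, s = s[x/t]\}$
    $\phantom{\{x = t, y = s\}} \implies \{x = t, y = s[x/t]\}.$

Similarly, using congruence axioms (3) and (4), we have

    $\{x = t, y = s[x/t]\} \implies \{x = t, y = s[x/t], s = s[x/t]\}$
    $\phantom{\{x = t, y = s[x/t]\}} \implies \{x = t, y = s\}.$

Thus

    $\{x = t, y = s\} \iff \{x = t, y = s[x/t]\}.$

It therefore follows from (10) and (11) that $\sigma \iff \sigma'$. □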

The condition $x \neq y$ in the proof of Lemma 10 is necessary. For example, suppose
$\sigma = \{x \mapsto f(x)\}$ and $\sigma' = \{x \mapsto f(f(x))\}$. Then we do not have $\sigma' \implies \sigma$. Note
however that this implication will hold as soon as we enrich the equality theory $T$
with either the occurs-check axioms or the uniqueness axioms of the rational trees'
theory.

Proof of Theorem 1. 

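The proof is by induction on the length of the sequence of S-steps transforming $\sigma$
to $\sigma'$. The base case is the empty sequence. For the inductive step, the sequence
has length $n > 0$ and there exists $\sigma_1$ such that $\sigma \overset{\mathcal{S}}{\longmapsto} \sigma_1 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and $\sigma_1 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ has
length $n - 1$. By Lemma 10, $\sigma_1 \in \mathit{RSubst}$, $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma_1)$, $\mathit{vars}(\sigma) = \mathit{vars}(\sigma_1)$
and $T \vdash \forall(\sigma \leftrightarrow \sigma_1)$. By the inductive hypothesis, $\sigma' \in \mathit{RSubst}$, $\mathrm{dom}(\sigma_1) = \mathrm{dom}(\sigma')$,
$\mathit{vars}(\sigma_1) = \mathit{vars}(\sigma')$ and $T \vdash \forall(\sigma_1 \leftrightarrow \sigma')$. Hence we have $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$,
$\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$, and $T \vdash \forall(\sigma \leftrightarrow \sigma')$. □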

Proof of Theorem 2. 

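To prove the theorem, we construct an S-transformation and show that the resulting
substitution has the required properties.

Suppose that $\{x_1, \ldots, x_n\} = \mathrm{dom}(\sigma)$, $\sigma_0 = \sigma$ and, for each $j = 0, \ldots, n$,

    $\sigma_j = \{x_1 \mapsto t_{1,j}, \ldots, x_n \mapsto t_{n,j}\},$

where, if $j > 0$, $t_{j,j} = t_{j,j-1}$ and, for each $i = 1, \ldots, n$ with $i \neq j$, we have

    $t_{i,j} = t_{i,j-1}[x_j/t_{j,j}].$

It follows from the definition of $\sigma_j$ that, for $j = 1, \ldots, n$, $\sigma_j$ can be obtained
from $\sigma_{j-1}$ by two sequences of S-steps of lengths $j - 1$ and $n - j$:

    $\sigma_{j-1} = \sigma_{j-1}^{0} \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_{j-1}^{j-1} = \sigma_{j-1}^{j} \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_{j-1}^{n} = \sigma_j,$

where, for $i = 1, \ldots, n$ with $i \neq j$,

    $\sigma_{j-1}^{i} = (\sigma_{j-1}^{i-1} \setminus \{x_i \mapsto t_{i,j-1}\}) \cup \{x_i \mapsto t_{i,j-1}[x_j/t_{j,j}]\}.$

Hence, by Theorem 1, $\sigma_1, \ldots, \sigma_n \in \mathit{RSubst}$.

We next show, by induction on $j$, with $0 \leq j \leq n$, that, for each $i = 1, \ldots, n$
and each $h = 1, \ldots, j$, we have $\mathit{vars}(t_{i,j}) = \mathit{vars}(t_{i,j}[x_h/t_{h,j}])$.

For the base case when $j = 0$ there is nothing to prove. Suppose, therefore, that
$0 < j \leq n$ and that, for each $i = 1, \ldots, n$ and $h = 1, \ldots, j - 1$,

    $\mathit{vars}(t_{i,j-1}) = \mathit{vars}(t_{i,j-1}[x_h/t_{h,j-1}]).$

Now by the definition of $t_{k,j}$ where $1 \leq k \leq n$, $k \neq j$, we have

    $\mathit{vars}(t_{k,j}) = \mathit{vars}(t_{k,j-1}[x_j/t_{j,j}]).$    (12)

Also, since a substitution consisting of a single binding is variable-idempotent,

    $\mathit{vars}(t_{j,j}) = \mathit{vars}(t_{j,j}[x_j/t_{j,j}])$

so that, as $t_{j,j} = t_{j,j-1}$,

    $\mathit{vars}(t_{j,j}) = \mathit{vars}(t_{j,j-1}[x_j/t_{j,j}]).$    (13)

Thus, by (12) and (13), for all $k$ such that $1 \leq k \leq n$, we have

    $\mathit{vars}(t_{k,j}) = \mathit{vars}(t_{k,j-1}[x_j/t_{j,j}]).$    (14)

Therefore, for each $i = 1, \ldots, n$ and $h = 1, \ldots, j$, using (14) and the inductive
hypothesis, we have

    $\mathit{vars}(t_{i,j}[x_h/t_{h,j}]) = \mathit{vars}(t_{i,j-1}[x_j/t_{j,j}][x_h/t_{h,j-1}[x_j/t_{j,j}]])$
    $\phantom{\mathit{vars}(t_{i,j}[x_h/t_{h,j}])} = \mathit{vars}(t_{i,j-1}[x_h/t_{h,j-1}][x_j/t_{j,j}])$
    $\phantom{\mathit{vars}(t_{i,j}[x_h/t_{h,j}])} = \mathit{vars}(t_{i,j-1}[x_j/t_{j,j}])$
    $\phantom{\mathit{vars}(t_{i,j}[x_h/t_{h,j}])} = \mathit{vars}(t_{i,j}).$

Letting $j = n$ we obtain, for each $i, h = 1, \ldots, n$,

    $\mathit{vars}(t_{i,n}[x_h/t_{h,n}]) = \mathit{vars}(t_{i,n}).$

Therefore, for all $\tau \subseteq \sigma_n$ and each $i = 1, \ldots, n$,

    $\mathit{vars}(t_{i,n}\tau) = \mathit{vars}(t_{i,n}).$

Thus, by Lemma 3, for all $\tau \subseteq \sigma_n$, $\tau \in \mathit{VSubst}$. The result follows by taking $\sigma' = \sigma_n$.
□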

Proof of Lemma 7. 

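Since $\tau, \sigma \in \mathit{VSubst}$ and $\mathrm{dom}(\sigma) \cap \mathit{vars}(\tau) = \emptyset$, we have that $(\tau \cup \sigma) \in \mathit{RSubst}$.
It follows from Eq. (1) that $\tau \circ \sigma$ can be obtained from $(\tau \cup \sigma)$ by a sequence of
S-steps so that, by Theorem 1, we have Properties 1 and 2.

To prove Property 3, we suppose that, for some $v \in \mathrm{dom}(\tau \circ \sigma)$, there exist
$w \in \mathit{vars}(v\sigma)$, $x \in \mathit{vars}(w\tau)$ and $y \in \mathit{vars}(x\sigma)$ such that $z \in \mathit{vars}(y\tau) \setminus \mathrm{dom}(\tau \circ \sigma)$.
We need to prove that $z \in \mathit{vars}(v\sigma\tau)$.

It follows from Property 2, that $z \notin \mathrm{dom}(\sigma)$ and $z \notin \mathrm{dom}(\tau)$. Suppose first that
$x \notin \mathrm{dom}(\sigma)$. Then $y = x$ and hence $z \in \mathit{vars}(v\sigma\tau\tau)$. Therefore, as $\tau \in \mathit{VSubst}$
and $z \notin \mathrm{dom}(\tau)$, we can conclude $z \in \mathit{vars}(v\sigma\tau)$. Thus, we now assume that
$x \in \mathrm{dom}(\sigma)$. As $\mathrm{dom}(\sigma) \cap \mathit{vars}(\tau) = \emptyset$, we have $x \notin \mathit{vars}(\tau)$, so that $x = w$ and
hence, $y \in \mathit{vars}(v\sigma\sigma)$. If $y \notin \mathrm{dom}(\tau)$ we have $y = z$, so that $y \notin \mathrm{dom}(\sigma)$. On the
other hand, if $y \in \mathrm{dom}(\tau)$ then, by the hypothesis, $y \notin \mathrm{dom}(\sigma)$. Thus, in both
cases, as $\sigma \in \mathit{VSubst}$, we obtain $y \in \mathit{vars}(v\sigma)$ and hence $z \in \mathit{vars}(v\sigma\tau)$. It follows,
using Eq. (9), that Property 3 holds. □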



5 The Abstraction Function for RSubst 

In this section we define a new abstraction function mapping arbitrary substitutions 
in rational solved form into their abstract descriptions. This abstraction function 
is based on a new definition for the notion of occurrence. The new occurrence oper- 
ator occ is defined on RSubst so that it does not require the explicit computation 
of intermediate variable-idempotent substitutions. To this end, it is given as the 
fixed point of a sequence of occurrence functions. The occ operator generalises the 
sg operator, defined for ISubst, coinciding with it when applied to idempotent sub- 
stitutions. 
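Definition 5

(Occurrence functions.) For each $n \in \mathbb{N}$, $\mathrm{occ}_n \colon \mathit{RSubst} \times \mathit{Vars} \rightarrow \wp_f(\mathit{Vars})$,
called occurrence function, is defined, for each $\sigma \in \mathit{RSubst}$ and each $v \in \mathit{Vars}$, by

    $\mathrm{occ}_0(\sigma, v) \stackrel{\mathrm{def}}{=} \{v\} \setminus \mathrm{dom}(\sigma)$

and, for $n > 0$, by

    $\mathrm{occ}_n(\sigma, v) \stackrel{\mathrm{def}}{=} \{\, y \in \mathit{Vars} \mid \mathit{vars}(y\sigma) \cap \mathrm{occ}_{n-1}(\sigma, v) \neq \emptyset \,\}.$

The following monotonicity property for $\mathrm{occ}_n$ is proved at the end of this section.

Lemma 11

If $n > 0$, then, for each $\sigma \in \mathit{RSubst}$ and each $v \in \mathit{Vars}$,

    $\mathrm{occ}_{n-1}(\sigma, v) \subseteq \mathrm{occ}_n(\sigma, v).$

Note that, by considering the substitution $\{u \mapsto v, v \mapsto w\}$, it can be seen that,
if we had not excluded the domain variables in the definition of $\mathrm{occ}_0$, then this
monotonicity property would not have held.

For any $n$, the set $\mathrm{occ}_n(\sigma, v)$ is restricted to the set $\{v\} \cup \mathit{vars}(\sigma)$. Thus, it follows
from Lemma 11, that there is an $\ell = \ell(\sigma, v) \in \mathbb{N}$ such that $\mathrm{occ}_\ell(\sigma, v) = \mathrm{occ}_n(\sigma, v)$
for all $n \geq \ell$.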

Definition 6 

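(Occurrence operator.) For each $\sigma \in \mathit{RSubst}$ and $v \in \mathit{Vars}$, the occurrence
operator $\mathrm{occ} \colon \mathit{RSubst} \times \mathit{Vars} \rightarrow \wp_f(\mathit{Vars})$ is given by

    $\mathrm{occ}(\sigma, v) \stackrel{\mathrm{def}}{=} \mathrm{occ}_\ell(\sigma, v)$

where $\ell \in \mathbb{N}$ is such that $\mathrm{occ}_\ell(\sigma, v) = \mathrm{occ}_n(\sigma, v)$ for all $n \geq \ell$.

Note that, by combining Definitions 5 and 6, we obtain

    $\mathrm{occ}(\sigma, v) = \{\, y \in \mathit{Vars} \mid \mathit{vars}(y\sigma) \cap \mathrm{occ}(\sigma, v) \neq \emptyset \,\}.$    (15)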

The following simpler characterisations for occ can be used when the variable is 
in the domain of the substitution, the substitution is variable-idempotent or the 
substitution is idempotent. 

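Lemma 12

If $\sigma \in \mathit{RSubst}$ and $v \in \mathrm{dom}(\sigma)$, then $\mathrm{occ}(\sigma, v) = \emptyset$.

Lemma 13

If $\sigma \in \mathit{VSubst}$ then, for each $v \in \mathit{Vars}$,

    $\mathrm{occ}(\sigma, v) = \mathrm{occ}_1(\sigma, v)$
    $\phantom{\mathrm{occ}(\sigma, v)} = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \setminus \mathrm{dom}(\sigma) \,\}.$

Lemma 14

If $\sigma \in \mathit{ISubst}$ and $v \in \mathit{Vars}$ then $\mathrm{occ}(\sigma, v) = \mathrm{sg}(\sigma, v)$.

These results are proved at the end of this section.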
Example 7

Consider again Example 6. Then, for all $i \geq 0$, $\mathrm{dom}(\sigma_i) = \{x_1, x_2, x_3\}$ so that

    $\mathrm{occ}(\sigma_i, x_1) = \mathrm{occ}(\sigma_i, x_2) = \mathrm{occ}(\sigma_i, x_3) = \emptyset.$

However,

    $\mathrm{occ}_0(\sigma_0, x_4) = \{x_4\},$
    $\mathrm{occ}_1(\sigma_0, x_4) = \{x_2, x_4\},$
    $\mathrm{occ}_2(\sigma_0, x_4) = \{x_1, x_2, x_4\},$
    $\mathrm{occ}_3(\sigma_0, x_4) = \{x_1, x_2, x_3, x_4\} = \mathrm{occ}(\sigma_0, x_4).$

Also, note that

    $\mathrm{occ}_1(\sigma_3, x_4) = \{x_1, x_2, x_3, x_4\} = \mathrm{occ}(\sigma_3, x_4).$

The definition of abstraction is based on the occurrence operator, occ. 

Definition 7 

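(Abstraction.) The concrete domain $\wp(\mathit{RSubst})$ is related to $\mathit{SS}$ by means of the
abstraction function $\alpha \colon \wp(\mathit{RSubst}) \times \wp_f(\mathit{Vars}) \rightarrow \mathit{SS}$. For each $\Sigma \in \wp(\mathit{RSubst})$ and
each $U \in \wp_f(\mathit{Vars})$,

    $\alpha(\Sigma, U) \stackrel{\mathrm{def}}{=} \bigsqcup_{\sigma \in \Sigma} \alpha(\sigma, U)$

where $\alpha \colon \mathit{RSubst} \times \wp_f(\mathit{Vars}) \rightarrow \mathit{SS}$ is defined, for each substitution $\sigma \in \mathit{RSubst}$ and
each $U \in \wp_f(\mathit{Vars})$, by

    $\alpha(\sigma, U) \stackrel{\mathrm{def}}{=} \bigl(\{\, \mathrm{occ}(\sigma, v) \cap U \mid v \in \mathit{Vars} \,\} \setminus \{\emptyset\},\; U\bigr).$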

Example 8 

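Let us consider Examples 6 and 7 once more. Then, assuming $U = \{x_1, x_2, x_3, x_4\}$,

    $\alpha(\sigma_0, U) = \bigl(\{\mathrm{occ}(\sigma_0, x_4)\}, U\bigr) = \bigl(\{\{x_1, x_2, x_3, x_4\}\}, U\bigr).$

As a second example, consider the substitution

    $\sigma = \{x_1 \mapsto f(x_1), x_2 \mapsto x_1, x_3 \mapsto x_1, x_4 \mapsto x_2\}.$

Then

    $\mathrm{occ}(\sigma, x_1) = \mathrm{occ}(\sigma, x_2) = \mathrm{occ}(\sigma, x_3) = \mathrm{occ}(\sigma, x_4) = \emptyset$

so that, if we again assume $U = \{x_1, x_2, x_3, x_4\}$,

    $\alpha(\sigma, U) = (\emptyset, U).$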

Any substitution in rational solved form is equivalent, with respect to any equality 
theory, to a variable-idempotent substitution having the same abstraction. 
Theorem 3

If $T$ is an equality theory and $\sigma \in \mathit{RSubst}$ is satisfiable in $T$, then there exists a
substitution $\sigma' \in \mathit{VSubst}$ such that $\tau \in \mathit{VSubst}$, for all $\tau \subseteq \sigma'$, $T \vdash \forall(\sigma \leftrightarrow \sigma')$,
$\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$ and $\alpha(\sigma, U) = \alpha(\sigma', U)$, for any $U \in \wp_f(\mathit{Vars})$.

Equivalent substitutions in rational solved form have the same abstraction. We
note that this property is essential for the implementation of the SS domain.

Theorem 4

If $T$ is a syntactic equality theory and $\sigma, \sigma' \in \mathit{RSubst}$ are satisfiable in $T$ and such
that $T \vdash \forall(\sigma \leftrightarrow \sigma')$, then $\alpha(\sigma, U) = \alpha(\sigma', U)$, for any $U \in \wp_f(\mathit{Vars})$.



5.1 Proofs of Lemmas 11, 12, 13 and 14 and Theorems 3 and 4 

Proof of Lemma 11. 

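The proof is by induction on $n$. For the base case (when $n = 1$), if $\mathrm{occ}_0(\sigma, v) \neq \emptyset$,
then $v \notin \mathrm{dom}(\sigma)$ and $\mathrm{occ}_0(\sigma, v) = \{v\}$. Thus, $v = v\sigma$ so that, by Definition 5,
$v \in \mathrm{occ}_1(\sigma, v)$. Suppose $n > 1$. Then, if $y \in \mathrm{occ}_{n-1}(\sigma, v)$, we have, by Definition 5,
$\mathit{vars}(y\sigma) \cap \mathrm{occ}_{n-2}(\sigma, v) \neq \emptyset$. By the induction hypothesis,

    $\mathrm{occ}_{n-2}(\sigma, v) \subseteq \mathrm{occ}_{n-1}(\sigma, v)$

so that $\mathit{vars}(y\sigma) \cap \mathrm{occ}_{n-1}(\sigma, v) \neq \emptyset$ and thus $y \in \mathrm{occ}_n(\sigma, v)$. □

Proof of Lemma 12.

By Definition 5, $\mathrm{occ}_0(\sigma, v) = \emptyset$ and, for all $n > 0$, we have $\mathrm{occ}_n(\sigma, v) = \emptyset$ if
$\mathrm{occ}_{n-1}(\sigma, v) = \emptyset$. Thus, $\mathrm{occ}_n(\sigma, v) = \emptyset$, for all $n \geq 0$, so that, by Definition 6,
$\mathrm{occ}(\sigma, v) = \emptyset$. □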

Proof of Lemma 13. 

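Suppose first that $v \in \mathrm{dom}(\sigma)$. Then

    $\{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \setminus \mathrm{dom}(\sigma) \,\} = \emptyset.$

Also, by Lemma 12, $\mathrm{occ}_1(\sigma, v) = \mathrm{occ}(\sigma, v) = \emptyset$.

Suppose next that $v \notin \mathrm{dom}(\sigma)$. It follows from Definition 5, that

    $\mathrm{occ}_0(\sigma, v) = \{v\},$

    $\mathrm{occ}_1(\sigma, v) = \{\, y \in \mathit{Vars} \mid \mathit{vars}(y\sigma) \cap \{v\} \neq \emptyset \,\}$
    $\phantom{\mathrm{occ}_1(\sigma, v)} = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \,\},$

and

    $\mathrm{occ}_2(\sigma, v) = \{\, y \in \mathit{Vars} \mid \mathit{vars}(y\sigma) \cap \{\, y_1 \in \mathit{Vars} \mid v \in \mathit{vars}(y_1\sigma) \,\} \neq \emptyset \,\}$
    $\phantom{\mathrm{occ}_2(\sigma, v)} = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma^2) \,\}.$

However, as $\sigma \in \mathit{VSubst}$, we have $\mathit{vars}(y\sigma) \setminus \mathrm{dom}(\sigma) = \mathit{vars}(y\sigma^2) \setminus \mathrm{dom}(\sigma)$. Thus,
as $v \notin \mathrm{dom}(\sigma)$, $\mathrm{occ}_1(\sigma, v) = \mathrm{occ}_2(\sigma, v)$ and hence, by Definition 5, we have also
$\mathrm{occ}_n(\sigma, v) = \mathrm{occ}_1(\sigma, v)$, for all $n \geq 1$. Therefore, by Definition 6,

    $\mathrm{occ}(\sigma, v) = \mathrm{occ}_1(\sigma, v) = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \,\}.$ □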
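Proof of Lemma 14.

As $\sigma \in \mathit{ISubst}$ we have, for all $y \in \mathit{Vars}$, $\mathit{vars}(y\sigma) \setminus \mathrm{dom}(\sigma) = \mathit{vars}(y\sigma)$. Also, as
$\sigma \in \mathit{VSubst}$, we can apply Lemma 13 so that

    $\mathrm{occ}(\sigma, v) = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \setminus \mathrm{dom}(\sigma) \,\}$
    $\phantom{\mathrm{occ}(\sigma, v)} = \{\, y \in \mathit{Vars} \mid v \in \mathit{vars}(y\sigma) \,\}$
    $\phantom{\mathrm{occ}(\sigma, v)} = \mathrm{sg}(\sigma, v).$

□

To prove Theorem 3, we need to show that the abstraction function $\alpha$ is invariant
with respect to S-transformation.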

Lemma 15 

Let $\sigma, \sigma' \in \mathit{RSubst}$ where $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and $U \in \wp_f(\mathit{Vars})$. Then $\alpha(\sigma, U) = \alpha(\sigma', U)$.

Proof

Suppose first that $\sigma \overset{\mathcal{S}}{\longmapsto} \sigma'$. Thus we assume that $(x \mapsto t), (y \mapsto s) \in \sigma$, where
$x \neq y$, and that

    $\sigma' = (\sigma \setminus \{y \mapsto s\}) \cup \{y \mapsto s[x/t]\}.$    (16)

Suppose $v \in \mathit{Vars}$. Then we show that $\mathrm{occ}(\sigma, v) = \mathrm{occ}(\sigma', v)$.

If $x \notin \mathit{vars}(s)$, then $\sigma' = \sigma$ and there is nothing to prove. Also, if $v \in \mathrm{dom}(\sigma)$
then, by Theorem 1, $v \in \mathrm{dom}(\sigma')$ so that by Lemma 12, $\mathrm{occ}(\sigma, v) = \mathrm{occ}(\sigma', v) = \emptyset$.

We now assume that $x \in \mathit{vars}(s)$ and $v = v\sigma = v\sigma'$. We first prove that, for each
$m \geq 0$,

    $\mathrm{occ}_m(\sigma, v) \subseteq \mathrm{occ}(\sigma', v).$    (17)

The proof is by induction on $m$. By Definition 5, we have that

    $\mathrm{occ}_0(\sigma, v) = \mathrm{occ}_0(\sigma', v) = \{v\},$

so that (17) holds for $m = 0$. Suppose then that $m > 0$ and that $v_m \in \mathrm{occ}_m(\sigma, v)$.
Then, to prove (17), we must show that $v_m \in \mathrm{occ}(\sigma', v)$. By Definition 5, there
exists

    $v_{m-1} \in \mathit{vars}(v_m\sigma) \cap \mathrm{occ}_{m-1}(\sigma, v).$    (18)

Hence, by the inductive hypothesis, $v_{m-1} \in \mathrm{occ}(\sigma', v)$. If $v_{m-1} \in \mathit{vars}(v_m\sigma')$, then,
by Eq. (15), $v_m \in \mathrm{occ}(\sigma', v)$. Suppose now that $v_{m-1} \notin \mathit{vars}(v_m\sigma')$. Since, by (18),
we have that $v_{m-1} \in \mathit{vars}(v_m\sigma)$, it follows, using (16), that $v_m = y$ and $v_{m-1} = x$.
However, by assumption, $v \notin \mathrm{dom}(\sigma)$, so that $x \neq v$ and $m > 1$. Thus, by
Definition 5, there exists

    $v_{m-2} \in \mathit{vars}(x\sigma) \cap \mathrm{occ}_{m-2}(\sigma, v).$    (19)

However, $x\sigma = t$ and $x \in \mathit{vars}(s)$ so that, by (19), we have $v_{m-2} \in \mathit{vars}(s[x/t])$.
Since, by Eq. (16), $(y \mapsto s[x/t]) \in \sigma'$, we have also $v_{m-2} \in \mathit{vars}(y\sigma')$. Moreover,
by (19), $v_{m-2} \in \mathrm{occ}_{m-2}(\sigma, v)$ so that, by the inductive hypothesis, we have that
$v_{m-2} \in \mathrm{occ}(\sigma', v)$. Thus, by Eq. (15), as $v_m = y$, $v_m \in \mathrm{occ}(\sigma', v)$.

Conversely, we now prove that, for all $m \geq 0$,

    $\mathrm{occ}_m(\sigma', v) \subseteq \mathrm{occ}(\sigma, v).$    (20)

The proof is again by induction on $m$. As before, $\mathrm{occ}_0(\sigma', v) = \mathrm{occ}_0(\sigma, v) = \{v\}$ so
that (20) holds for $m = 0$. Suppose then that $m > 0$ and $v_m \in \mathrm{occ}_m(\sigma', v)$. Then,
to prove (20), we must show that $v_m \in \mathrm{occ}(\sigma, v)$. By Definition 5, there exists

    $v_{m-1} \in \mathit{vars}(v_m\sigma') \cap \mathrm{occ}_{m-1}(\sigma', v).$    (21)

Hence, by the inductive hypothesis, $v_{m-1} \in \mathrm{occ}(\sigma, v)$. If $v_{m-1} \in \mathit{vars}(v_m\sigma)$ then,
by Eq. (15), we have $v_m \in \mathrm{occ}(\sigma, v)$. Suppose now that $v_{m-1} \notin \mathit{vars}(v_m\sigma)$. Since,
by (21), we have $v_{m-1} \in \mathit{vars}(v_m\sigma')$, it follows, using Eq. (16), that $v_m = y$ and
$v_{m-1} \in \mathit{vars}(t) = \mathit{vars}(x\sigma)$. Hence, since $v_{m-1} \in \mathrm{occ}(\sigma, v)$, by Eq. (15), we have
also $x \in \mathrm{occ}(\sigma, v)$. Furthermore, $x \in \mathit{vars}(y\sigma)$ so again, by Eq. (15), as $v_m = y$,
$v_m \in \mathrm{occ}(\sigma, v)$.

Combining (17) and (20) we obtain the result that, if $\sigma'$ is obtained from $\sigma$
by a single S-step, then $\mathrm{occ}(\sigma, v) = \mathrm{occ}(\sigma', v)$. Thus, as $v \in \mathit{Vars}$ was arbitrary,
$\alpha(\sigma, U) = \alpha(\sigma', U)$.

Suppose now that $\sigma = \sigma_1 \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_n = \sigma'$. If $n = 1$, then $\sigma = \sigma'$. If $n > 1$, we
have by the first part of the proof that, for each $i = 2, \ldots, n$, $\alpha(\sigma_{i-1}, U) = \alpha(\sigma_i, U)$,
and hence the required result. □

Proof of Theorem 3. 

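By Theorem 2, there exists $\sigma' \in \mathit{VSubst}$ such that $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and, for any $\tau \subseteq \sigma'$,
$\tau \in \mathit{VSubst}$. Moreover, by Theorem 1, $\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$ and $T \vdash \forall(\sigma \leftrightarrow \sigma')$.
Thus, by Lemma 15, $\alpha(\sigma, U) = \alpha(\sigma', U)$. □

To prove Theorem 4, we need to show that the abstraction function $\alpha$ is invariant
when we exchange equivalent variables to obtain an ordered substitution.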

Lemma 16 

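Suppose $\sigma \in \mathit{VSubst}$, $v, w \in \mathit{Vars}$ and $(v \mapsto w) \in \sigma$. Let $\rho = \{v \mapsto w, w \mapsto v\}$ be a
(circular) substitution and define $\sigma' = \rho \circ \sigma = \{\, x\rho \mapsto t\rho \mid x \mapsto t \in \sigma \,\}$. Then

1. $\sigma' \in \mathit{VSubst}$,

2. $\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$,

3. $\alpha(\sigma, U) = \alpha(\sigma', U)$, for all $U \in \wp_f(\mathit{Vars})$, and

4. $T \vdash \forall(\sigma \leftrightarrow \sigma')$, for any equality theory $T$.

Proof

Since $\sigma'$ is obtained from $\sigma$ by renaming variables and $\sigma \in \mathit{VSubst}$, we have also
that $\sigma' \in \mathit{VSubst}$. In addition, $\mathit{vars}(\sigma) \setminus \{v, w\} = \mathit{vars}(\sigma') \setminus \{v, w\}$ so that, since
$(v \mapsto w) \in \sigma$ and $(w \mapsto v) \in \sigma'$, we have $\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$.

To prove property 3, we have to show that, if

    $\alpha(\sigma, U) = (\mathit{sh}, U)$ and $\alpha(\sigma', U) = (\mathit{sh}', U),$

then $\mathit{sh} = \mathit{sh}'$. By the hypothesis, for all $y \in \mathit{Vars}$ we have $x \in \mathit{vars}(y\sigma)$ if and only if
$x\rho \in \mathit{vars}(y\sigma')$. As $\sigma, \sigma' \in \mathit{VSubst}$, we can use the alternative characterisation of occ
given by Lemma 13 and conclude that, for each $x \in \mathit{Vars}$, $\mathrm{occ}(\sigma, x) = \mathrm{occ}(\sigma', x\rho)$.
Therefore $\mathit{sh} \subseteq \mathit{sh}'$. The reverse inclusion follows by symmetry so that $\mathit{sh} = \mathit{sh}'$.

To prove property 4, we first show by induction on the depth of $r \in \mathcal{T}_{\mathit{Vars}}$ that

    $T \vdash \forall\bigl((v = w) \rightarrow (r = r\rho)\bigr).$    (22)

For the base case, $r$ has depth 1. If $r$ is a constant or a variable other than $v$
or $w$, then $r = r\rho$. If $r = v$, then $r\rho = w$ and $T \vdash \forall((v = w) \rightarrow (v = w))$.
Finally, if $r = w$, then $r\rho = v$ and we have, using the congruence axioms, that
$T \vdash \forall((v = w) \rightarrow (w = v))$. For the inductive step, let $r = f(r_1, \ldots, r_n)$. Then
$r\rho = f(r_1\rho, \ldots, r_n\rho)$. Thus, using the inductive hypothesis, for each $i = 1, \ldots, n$,
$T \vdash \forall((v = w) \rightarrow (r_i = r_i\rho))$. Hence, by the congruence axioms, (22) holds.

Note that $(v \mapsto w) \in \sigma$. Thus, it follows from (22) that, for each $(x \mapsto t) \in \sigma$,
$T \vdash \forall(\sigma \rightarrow \{x = t, x = x\rho, t = t\rho\})$ and hence, using the congruence axioms,
$T \vdash \forall(\sigma \rightarrow \{x\rho = t\rho\})$. Thus, $T \vdash \forall(\sigma \rightarrow \sigma')$. Since $(w \mapsto v) \in \sigma'$, the reverse
implication follows by symmetry so that $T \vdash \forall(\sigma' \leftrightarrow \sigma)$. □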

Lemma 17 

Suppose $\sigma \in \mathit{VSubst}$. Then there exists $\sigma' \in \mathit{VSubst}$ that is ordered such that
$\mathit{vars}(\sigma) = \mathit{vars}(\sigma')$, $\alpha(\sigma, U) = \alpha(\sigma', U)$, for all $U \in \wp_f(\mathit{Vars})$, and $T \vdash \forall(\sigma \leftrightarrow \sigma')$,
for any equality theory $T$.

Proof

The proof is by induction on the number $b \geq 0$ of the bindings $(v \mapsto w) \in \sigma$ such
that $w \in \mathrm{param}(\sigma)$ and $w > v$ (the number of unordered bindings). For the base
case, when $b = 0$, $\sigma$ is ordered and the result holds by taking $\sigma' = \sigma$.

For the inductive case, when $b > 0$, let $(v \mapsto w) \in \sigma$ be an unordered binding
and define $\rho = \{v \mapsto w, w \mapsto v\}$. Then, by Lemma 16, we have $\rho \circ \sigma \in \mathit{VSubst}$,
$\mathit{vars}(\sigma) = \mathit{vars}(\rho \circ \sigma)$, $\alpha(\sigma, U) = \alpha(\rho \circ \sigma, U)$, for all $U \in \wp_f(\mathit{Vars})$, and, finally,
$T \vdash \forall(\sigma \leftrightarrow \rho \circ \sigma)$, for any equality theory $T$. In order to apply the inductive
hypothesis to $\rho \circ \sigma$, we must show that the number of unordered bindings in $\rho \circ \sigma$ is
less than $b$. To this end, roughly speaking, we start showing that any ordered binding
in $\sigma$ is mapped by $\rho$ into another ordered binding in $\rho \circ \sigma$, therefore proving that the
number of unordered bindings is not increasing. There are three cases. First, any
ordered binding $(y \mapsto t) \in \sigma$ such that $t \notin \mathit{Vars}$ is mapped by $\rho$ into the binding
$(y\rho \mapsto t\rho) \in (\rho \circ \sigma)$ which is clearly ordered, since $t\rho \notin \mathit{Vars}$. Second, consider any
ordered binding $(y \mapsto z) \in \sigma$ such that $z \in \mathrm{dom}(\sigma)$. Since $w \in \mathrm{param}(\sigma)$, we have
$z \neq w$. If also $z \neq v$ then we have $z\rho = z$ and $z \in \mathrm{dom}(\rho \circ \sigma)$; otherwise $z = v$ so
that $z\rho = w$ and, as $(w \mapsto v) \in (\rho \circ \sigma)$, $z\rho \in \mathrm{dom}(\rho \circ \sigma)$. Thus, in either case, such
a binding is mapped by $\rho$ into the binding $(y\rho \mapsto z\rho) \in (\rho \circ \sigma)$ which is ordered
since $z\rho \in \mathrm{dom}(\rho \circ \sigma)$. Third, consider any ordered binding $(y \mapsto z) \in \sigma$ such that
$z \in \mathrm{param}(\sigma)$ and $z < y$. The ordering relation implies $y \neq v$ and we also have
$y \neq w$, since $w \in \mathrm{param}(\sigma)$. Hence, we obtain $y\rho = y$. Now, as $z \in \mathrm{param}(\sigma)$, $z \neq v$.
If $z \neq w$, then $z\rho = z$. On the other hand, if $z = w$, then $z\rho = v$ so that $z\rho < z$.
Thus, in both cases, as $z < y$, $z\rho < y$ and hence, $(y\rho \mapsto z\rho) \in (\rho \circ \sigma)$ is ordered.
Finally, to show that the number of unordered bindings is strictly decreasing, we
note that the unordered binding $(v \mapsto w) \in \sigma$ is mapped by $\rho$ into the binding
$(w \mapsto v) \in (\rho \circ \sigma)$, which is ordered.

Therefore, by applying the inductive hypothesis, there exists a substitution $\sigma'$
such that $\sigma' \in \mathit{VSubst}$ is ordered, $\mathit{vars}(\rho \circ \sigma) = \mathit{vars}(\sigma')$, $\alpha(\rho \circ \sigma, U) = \alpha(\sigma', U)$,
for all $U \in \wp_f(\mathit{Vars})$, and $T \vdash \forall(\rho \circ \sigma \leftrightarrow \sigma')$, for any equality theory $T$. Then the
required result follows by transitivity. □

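Proof of Theorem 4.

By Theorem 3, we can assume that $\sigma, \sigma' \in \mathit{VSubst}$, $T \vdash \forall(\sigma \leftrightarrow \sigma')$ and, for any
$U \in \wp_f(\mathit{Vars})$, $\alpha(\sigma, U) = \alpha(\sigma', U)$. By Lemma 17, we can assume that $\sigma, \sigma'$ are also
ordered substitutions so that, by Lemma 5, $\mathrm{dom}(\sigma') = \mathrm{dom}(\sigma)$.

To prove the result we need to show that, for all $v \in \mathit{Vars}$, we have both
$\mathrm{occ}(\sigma, v) \subseteq \mathrm{occ}(\sigma', v)$ and $\mathrm{occ}(\sigma', v) \subseteq \mathrm{occ}(\sigma, v)$. We just prove the first of these
as the other case is symmetric.

Suppose that $w \in \mathit{Vars}$ and that $v \in \mathit{vars}(w\sigma) \setminus \mathrm{dom}(\sigma)$. Then, using the
alternative characterisation of occ for variable-idempotent substitutions given by
Lemma 13, we just have to show that $v \in \mathit{vars}(w\sigma') \setminus \mathrm{dom}(\sigma')$.

By Lemma 6 (replacing $\tau$ by $\sigma$, $\sigma$ by $\sigma'$ and $s = t$ by $w = w$), we have that there
exists $z \in \mathit{vars}(w\sigma') \setminus \mathrm{dom}(\sigma')$ such that $v \in \mathit{vars}(z\sigma)$. Thus as $\mathrm{dom}(\sigma') = \mathrm{dom}(\sigma)$,
$z \notin \mathrm{dom}(\sigma)$, and hence, $v = z$ so that $v \in \mathit{vars}(w\sigma') \setminus \mathrm{dom}(\sigma')$, as required. □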

6 Abstract Unification 

The operations of abstract unification together with statements of the main results
are presented here in three stages. In the first two stages, we consider substitutions
containing just a single binding. For the first, it is assumed that the set of variables 
of interest is fixed so that the definition is based on the SH domain. Then, in the 
second, using the SS domain, the definition is extended to allow for the introduction 
of new variables in the binding. The final stage extends this definition further to 
deal with arbitrary substitutions. 

6.1 Abstract Operations for Sharing Sets 

The abstract unifier amgu abstracts the efi'ect of a single binding on an element of 
the SH domain. For this we need some ancillary definitions. 

Definition 8 

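(Auxiliary functions.) The closure under union function (also called star-union),
$(\cdot)^\star \colon \mathit{SH} \to \mathit{SH}$, is given, for each $sh \in \mathit{SH}$, by

$$sh^\star \stackrel{\mathrm{def}}{=} \{\, S \in \mathit{SG} \mid \exists n \geq 1 \,.\, \exists S_1, \ldots, S_n \in sh \,.\, S = S_1 \cup \cdots \cup S_n \,\}.$$

For each $sh \in \mathit{SH}$ and each $V \in \wp_f(\mathit{Vars})$, the extraction of the relevant component
of $sh$ with respect to $V$ is encoded by $\mathrm{rel} \colon \wp_f(\mathit{Vars}) \times \mathit{SH} \to \mathit{SH}$ defined as

$$\mathrm{rel}(V, sh) \stackrel{\mathrm{def}}{=} \{\, S \in sh \mid S \cap V \neq \emptyset \,\}.$$

For each $sh_1, sh_2 \in \mathit{SH}$, the binary union function $\mathrm{bin} \colon \mathit{SH} \times \mathit{SH} \to \mathit{SH}$ is given
by

$$\mathrm{bin}(sh_1, sh_2) \stackrel{\mathrm{def}}{=} \{\, S_1 \cup S_2 \mid S_1 \in sh_1, S_2 \in sh_2 \,\}.$$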
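
As an informal illustration, the three auxiliary functions can be sketched in Python. The encoding is an assumption made only for this example: a sharing group is a `frozenset` of variable names, an element of SH is a `frozenset` of such groups, and the names `sg`, `star`, `rel` and `bin_union` are hypothetical helpers, not part of the paper.

```python
def sg(*names):
    """Build one sharing group (a non-empty set of variable names)."""
    return frozenset(names)

def star(sh):
    """Star-union: every union of one or more sharing groups of sh."""
    closed = set()
    for group in sh:
        # Add the new group together with its union with every set
        # generated so far; this enumerates all non-empty unions.
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    """Relevant component: the groups of sh sharing a variable with V."""
    return frozenset(S for S in sh if S & V)

def bin_union(sh1, sh2):
    """Binary union: all pairwise unions S1 | S2."""
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

sh = frozenset({sg("x"), sg("y"), sg("x", "z")})
closure = star(sh)
relevant = rel(sg("x"), sh)
pairs = bin_union(frozenset({sg("x")}), frozenset({sg("y"), sg("z")}))
```

Here `star` builds the closure incrementally rather than enumerating all subsets of `sh`; the result is the same set of unions, but the closure can still be exponential in the size of `sh`.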

Definition 9 

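(amgu.) The function $\mathrm{amgu} \colon \mathit{SH} \times \mathit{Bind} \to \mathit{SH}$ captures the effects of a binding
on an SH element. Suppose $x \in \mathit{Vars}$, $r \in \mathcal{T}_{\mathit{Vars}}$, and $sh \in \mathit{SH}$. Let

$$A \stackrel{\mathrm{def}}{=} \mathrm{rel}(\{x\}, sh),$$

$$B \stackrel{\mathrm{def}}{=} \mathrm{rel}(\mathit{vars}(r), sh).$$

Then

$$\mathrm{amgu}(sh, x \mapsto r) \stackrel{\mathrm{def}}{=} \bigl(sh \setminus (A \cup B)\bigr) \cup \mathrm{bin}(A^\star, B^\star).$$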
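
A minimal Python sketch of this definition follows. It assumes the same illustrative encoding as before (groups as `frozenset`s of variable names) and represents the term r only by its set of variables `r_vars`, since amgu depends on r only through vars(r). All function names are hypothetical.

```python
def star(sh):
    """Star-union: every union of one or more groups of sh."""
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    """Groups of sh that share at least one variable with V."""
    return frozenset(S for S in sh if S & V)

def bin_union(sh1, sh2):
    """All pairwise unions of a group of sh1 with a group of sh2."""
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

def amgu(sh, x, r_vars):
    """Abstract effect of the binding x -> r, where r_vars = vars(r)."""
    A = rel(frozenset({x}), sh)       # groups containing x
    B = rel(frozenset(r_vars), sh)    # groups sharing with r
    return (frozenset(sh) - (A | B)) | bin_union(star(A), star(B))

sh = frozenset({frozenset({"x"}), frozenset({"y"}), frozenset({"z"})})
after_xy = amgu(sh, "x", {"y"})          # x -> f(y)
after_xyz = amgu(sh, "x", {"y", "z"})    # x -> f(y, z)
after_ground = amgu(sh, "x", set())      # x -> a (ground term)
```

For a ground term, B is empty, so bin(A*, B*) is empty and every group containing x is simply removed, which matches the intuition that x becomes ground.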
The following soundness result for amgu is proved in Section 6.4. 
Theorem 5 

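Let $T$ be a syntactic equality theory, $(sh, U) \in \mathit{SS}$ an abstract description and
$\{x \mapsto r\}, \sigma \in \mathit{RSubst}$ such that $\mathit{vars}(x \mapsto r) \cup \mathit{vars}(\sigma) \subseteq U$. Suppose that there
exists a most general solution $\mu$ for $(\{x = r\} \cup \sigma)$ in $T$. Then

$$\alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha(\mu, U) \preceq_{SS} \bigl(\mathrm{amgu}(sh, x \mapsto r), U\bigr).$$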

The following theorems, proved in Section 6.4, show that amgu is idempotent 
and commutative. 

Theorem 6 

Let $sh \in \mathit{SH}$ and $(x \mapsto r) \in \mathit{Bind}$. Then

$$\mathrm{amgu}(sh, x \mapsto r) = \mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), x \mapsto r\bigr).$$

Theorem 7 

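Let $sh \in \mathit{SH}$ and $(x \mapsto r), (y \mapsto t) \in \mathit{Bind}$. Then

$$\mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), y \mapsto t\bigr) = \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr).$$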
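
The two statements can be exercised on small instances. The following Python sketch is only a randomized sanity check over three variables, not a substitute for the proofs; the encoding (groups as `frozenset`s, terms represented by their variable sets) and all names are assumptions of this example.

```python
import random
from itertools import combinations

def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    return frozenset(S for S in sh if S & V)

def bin_union(sh1, sh2):
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

def amgu(sh, x, r_vars):
    A = rel(frozenset({x}), sh)
    B = rel(frozenset(r_vars), sh)
    return (frozenset(sh) - (A | B)) | bin_union(star(A), star(B))

VARS = ("x", "y", "z")
# All seven non-empty sharing groups over VARS.
GROUPS = [frozenset(c) for n in (1, 2, 3) for c in combinations(VARS, n)]

def random_sh(rng):
    return frozenset(g for g in GROUPS if rng.random() < 0.5)

def random_binding(rng):
    # The term is represented only by its (possibly empty) variable set.
    return rng.choice(VARS), frozenset(v for v in VARS if rng.random() < 0.5)

def check_amgu_properties(trials=2000, seed=0):
    """Return True if idempotence and commutativity hold on every sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        sh = random_sh(rng)
        x, r = random_binding(rng)
        y, t = random_binding(rng)
        once = amgu(sh, x, r)
        if amgu(once, x, r) != once:                       # idempotence
            return False
        if amgu(once, y, t) != amgu(amgu(sh, y, t), x, r):  # commutativity
            return False
    return True

properties_hold = check_amgu_properties()
```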

6.2 Abstract Operations for Sharing Domains 

The definitions and results of Section 6.1 can be lifted to apply to the proper set- 
sharing domain. 

Definition 10 

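(Amgu.) The operation $\mathrm{Amgu} \colon \mathit{SS} \times \mathit{Bind} \to \mathit{SS}$ extends the SS description it
takes as an argument to the set of variables occurring in the binding it is given as
the second argument. Then it applies amgu. Formally:

$$U' \stackrel{\mathrm{def}}{=} \mathit{vars}(x \mapsto r) \setminus U,$$

$$\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr) \stackrel{\mathrm{def}}{=} \Bigl(\mathrm{amgu}\bigl(sh \cup \{\, \{u\} \mid u \in U' \,\}, x \mapsto r\bigr),\; U \cup U'\Bigr).$$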
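
The lifting can be sketched as follows, again under the illustrative assumption that groups are `frozenset`s of variable names, that an SS description is a Python pair `(sh, U)`, and that the term r is represented by `r_vars = vars(r)`. The function names are hypothetical.

```python
def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    return frozenset(S for S in sh if S & V)

def bin_union(sh1, sh2):
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

def amgu(sh, x, r_vars):
    A = rel(frozenset({x}), sh)
    B = rel(frozenset(r_vars), sh)
    return (frozenset(sh) - (A | B)) | bin_union(star(A), star(B))

def Amgu(desc, x, r_vars):
    """Lift amgu to descriptions (sh, U), adding variables new to U."""
    sh, U = desc
    U = frozenset(U)
    new_vars = (frozenset({x}) | frozenset(r_vars)) - U          # U'
    # Each new variable starts as a singleton group: it shares with nothing.
    widened = frozenset(sh) | {frozenset({u}) for u in new_vars}
    return amgu(widened, x, r_vars), U | new_vars

desc = (frozenset({frozenset({"x"})}), frozenset({"x"}))
result = Amgu(desc, "x", {"y"})   # binding x -> f(y), with y not yet in U
```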

The results for amgu can easily be extended to apply to Amgu giving us the 
following corollaries. 
Corollary 1

Let $T$ be a syntactic equality theory, $(sh, U) \in \mathit{SS}$ and $\{x \mapsto r\}, \sigma \in \mathit{RSubst}$ such
that $\mathit{vars}(\sigma) \subseteq U$. Suppose there exists a most general solution $\mu$ for $(\{x = r\} \cup \sigma)$
in $T$. Then

$$\alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha\bigl(\mu, U \cup \mathit{vars}(x \mapsto r)\bigr) \preceq_{SS} \mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr).$$

Corollary 2 

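Let $sh \in \mathit{SH}$ and $(x \mapsto r) \in \mathit{Bind}$. Then

$$\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr) = \mathrm{Amgu}\Bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), x \mapsto r\Bigr).$$

Corollary 3

Let $sh \in \mathit{SH}$ and $(x \mapsto r), (y \mapsto t) \in \mathit{Bind}$. Then

$$\mathrm{Amgu}\Bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), y \mapsto t\Bigr) = \mathrm{Amgu}\Bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto t\bigr), x \mapsto r\Bigr).$$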

6.3 Abstract Unifiers for Sharing 

We now extend the above definitions and results for a single binding to any substi- 
tution. 

Definition 11 

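(aunify.) The function $\mathrm{aunify} \colon \mathit{SS} \times \mathit{RSubst} \to \mathit{SS}$ generalizes Amgu to any
substitution $\mu \in \mathit{RSubst}$ in the context of some syntactic equality theory $T$: If we have
$(sh, U) \in \mathit{SS}$, then

$$\mathrm{aunify}\bigl((sh, U), \emptyset\bigr) \stackrel{\mathrm{def}}{=} (sh, U);$$

if $\mu$ is satisfiable in $T$ and $(x \mapsto r) \in \mu$,

$$\mathrm{aunify}\bigl((sh, U), \mu\bigr) \stackrel{\mathrm{def}}{=} \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), \mu \setminus \{x \mapsto r\}\Bigr);$$

and, if $\mu$ is not satisfiable in $T$,

$$\mathrm{aunify}\bigl((sh, U), \mu\bigr) \stackrel{\mathrm{def}}{=} \bot.$$

For the distinguished elements $\bot$ and $\top$ of SS,

$$\mathrm{aunify}(\bot, \mu) \stackrel{\mathrm{def}}{=} \bot,$$
$$\mathrm{aunify}(\top, \mu) \stackrel{\mathrm{def}}{=} \top.$$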
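
A hedged Python sketch of this recursion is given below. Several simplifications are assumptions of the example and not part of the definition: a substitution is a `dict` from a variable name to the variable set of its term, satisfiability in T is not decided here but supplied by the caller through the hypothetical `satisfiable` flag, the bottom element is modelled by `None`, and the top element is omitted.

```python
def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    return frozenset(S for S in sh if S & V)

def bin_union(sh1, sh2):
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

def amgu(sh, x, r_vars):
    A = rel(frozenset({x}), sh)
    B = rel(frozenset(r_vars), sh)
    return (frozenset(sh) - (A | B)) | bin_union(star(A), star(B))

def Amgu(desc, x, r_vars):
    sh, U = desc
    new_vars = (frozenset({x}) | frozenset(r_vars)) - frozenset(U)
    widened = frozenset(sh) | {frozenset({u}) for u in new_vars}
    return amgu(widened, x, r_vars), frozenset(U) | new_vars

BOTTOM = None  # stands in for the bottom element of SS in this sketch

def aunify(desc, bindings, satisfiable=True):
    """Fold Amgu over the bindings; the empty substitution returns desc."""
    if desc is BOTTOM or not satisfiable:
        return BOTTOM
    for x, r_vars in bindings.items():
        desc = Amgu(desc, x, r_vars)
    return desc

fs = frozenset
desc = (fs({fs({"x"}), fs({"y"}), fs({"z"})}), fs({"x", "y", "z"}))
forward = aunify(desc, {"x": {"y"}, "y": {"z"}})    # x -> f(y), y -> g(z)
backward = aunify(desc, {"y": {"z"}, "x": {"y"}})   # same bindings, other order
```

The loop replaces the recursive choice of a binding in the definition; processing the bindings in either order yields the same description on this example.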
As a result of Corollary 3, Amgu and aunify commute. 

Lemma 18 

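Let $(sh, U) \in \mathit{SS}$, $\nu \in \mathit{RSubst}$ and $(y \mapsto t) \in \mathit{Bind}$. Then

$$\mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto t\bigr), \nu\Bigr) = \mathrm{Amgu}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu\bigr), y \mapsto t\Bigr).$$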
As a consequence of this and Corollaries 1, 2 and 3, we have the following soundness, 
idempotence and commutativity results required for aunify to be sound and well- 
defined. 

Theorem 8 

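Let $T$ be a syntactic equality theory, $(sh, U) \in \mathit{SS}$ and $\sigma, \nu \in \mathit{RSubst}$ such that
$\mathit{vars}(\sigma) \subseteq U$. Suppose also that there exists a most general solution $\mu$ for $(\nu \cup \sigma)$
in $T$. Then

$$\alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha\bigl(\mu, U \cup \mathit{vars}(\nu)\bigr) \preceq_{SS} \mathrm{aunify}\bigl((sh, U), \nu\bigr).$$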

This theorem shows also that it is safe for the analyzer to perform part or all of 
the concrete unification algorithm before computing aunify. 

Theorem 9 

Let $(sh, U) \in \mathit{SS}$ and $\nu \in \mathit{RSubst}$. Then

$$\mathrm{aunify}\bigl((sh, U), \nu\bigr) = \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu\bigr), \nu\Bigr).$$

Theorem 10 

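Let $(sh, U) \in \mathit{SS}$ and $\nu_1, \nu_2 \in \mathit{RSubst}$. Then

$$\mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu_1\bigr), \nu_2\Bigr) = \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu_2\bigr), \nu_1\Bigr).$$
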
The proofs of all these results are in Section 6.5. 

6.4 Proofs of Results for Sharing-Sets 

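In the proofs we use the fact that $(\cdot)^\star$ and rel are monotonic so that

$$sh_1 \subseteq sh_2 \implies sh_1^\star \subseteq sh_2^\star, \tag{23}$$

$$sh_1 \subseteq sh_2 \implies \mathrm{rel}(U, sh_1) \subseteq \mathrm{rel}(U, sh_2). \tag{24}$$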

We will also use the fact that (•)* is idempotent. 

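Let $t_1, \ldots, t_n$ be terms. For the sake of brevity we will use the notation $v_{t_1 \cdots t_n}$ to
denote $\bigcup_{i=1}^{n} \mathit{vars}(t_i)$. In particular, if $x$ and $y$ are variables, and $r$ and $t$ are terms,
we will use the following definitions:

$$v_x \stackrel{\mathrm{def}}{=} \{x\}, \qquad v_y \stackrel{\mathrm{def}}{=} \{y\},$$

$$v_r \stackrel{\mathrm{def}}{=} \mathit{vars}(r), \qquad v_t \stackrel{\mathrm{def}}{=} \mathit{vars}(t),$$

$$v_{xr} \stackrel{\mathrm{def}}{=} v_x \cup v_r, \qquad v_{yt} \stackrel{\mathrm{def}}{=} v_y \cup v_t.$$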

Definition 12 

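($\overline{\mathrm{rel}}$.) Suppose $V \in \wp_f(\mathit{Vars})$ and $sh \in \mathit{SH}$. Then

$$\overline{\mathrm{rel}}(V, sh) \stackrel{\mathrm{def}}{=} sh \setminus \mathrm{rel}(V, sh).$$

Notice that if $S \in \overline{\mathrm{rel}}(V, sh)$ then $S \cap V = \emptyset$. Conversely, if $S \in sh$ and $S \cap V = \emptyset$
then $S \in \overline{\mathrm{rel}}(V, sh)$. The following definition of amgu is clearly equivalent to the
one given in Definition 9: for each variable $x$, each term $r$, and each $sh \in \mathit{SH}$,

$$\mathrm{amgu}(sh, x \mapsto r) \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(v_{xr}, sh) \cup \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr). \tag{25}$$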
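
The claimed equivalence between Definition 9 and Eq. (25) can be checked mechanically on a small universe. The sketch below enumerates every sharing set over three variables and every binding whose term is represented by a variable set; the encoding and the names `rel_bar`, `amgu_def9` and `amgu_eq25` are assumptions of this example.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))

def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    return frozenset(S for S in sh if S & V)

def rel_bar(V, sh):
    """Complement of rel: the groups of sh disjoint from V."""
    return frozenset(sh) - rel(V, sh)

def bin_union(sh1, sh2):
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

def amgu_def9(sh, x, r_vars):
    A = rel(frozenset({x}), sh)
    B = rel(frozenset(r_vars), sh)
    return (frozenset(sh) - (A | B)) | bin_union(star(A), star(B))

def amgu_eq25(sh, x, r_vars):
    vx, vr = frozenset({x}), frozenset(r_vars)
    return rel_bar(vx | vr, sh) | bin_union(star(rel(vx, sh)), star(rel(vr, sh)))

VARS = "xyz"
GROUPS = [frozenset(c) for c in powerset(VARS) if c]       # 7 non-empty groups
ALL_SH = [frozenset(c) for c in powerset(GROUPS)]           # 128 sharing sets
ALL_BINDINGS = [(x, frozenset(r)) for x in VARS for r in powerset(VARS)]

definitions_agree = all(
    amgu_def9(sh, x, r) == amgu_eq25(sh, x, r)
    for sh in ALL_SH for x, r in ALL_BINDINGS
)
```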
Proof of Theorem 5.

We first prove the result under the assumption that $\alpha(\sigma, U) = (sh, U)$. We do this
in two parts. In the first, we partition $\sigma$ into two substitutions one of which, called
$\sigma^{-}$, is the same as $\sigma$ when $\sigma$ and $\mu$ are idempotent. We construct a new substitution
$\nu$ which, in the case that $\sigma$ and $\mu$ are idempotent, is a most general solution for
$x\sigma = r\sigma$. Finally we compose $\nu$ with $\sigma^{-}$ to define a substitution that has the same
abstraction as $\mu$ but with a number of useful properties including that of
variable-idempotence. In the second part, we use this composed substitution in place of $\mu$
to prove the result.

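Part 1. By Theorem 3, we can assume that

$$\sigma \in \mathit{VSubst} \tag{26}$$

and that all subsets of $\sigma$ are in VSubst. Let $\sigma^{\circ}, \sigma^{-} \in \mathit{RSubst}$ be defined such that

$$\sigma^{-} \cup \sigma^{\circ} = \sigma, \tag{27}$$
$$\mathrm{dom}(\sigma^{\circ}) = \mathrm{dom}(\sigma) \cap \bigcup_{i \geq 1} \mathit{vars}(x\sigma^{i} = r\sigma^{i}), \tag{28}$$
$$\mathrm{dom}(\sigma^{-}) \cap \mathrm{dom}(\sigma^{\circ}) = \emptyset. \tag{29}$$

Then, it follows from the above assumption on subsets of $\sigma$ that

$$\sigma^{-} \in \mathit{VSubst}, \qquad \sigma^{\circ} \in \mathit{VSubst}. \tag{30}$$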

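Now, suppose $z \in \mathit{vars}(\sigma^{\circ}) \setminus \mathrm{dom}(\sigma^{\circ})$. Then $z \in \mathit{vars}(y\sigma^{\circ})$ for some $y \in \mathrm{dom}(\sigma^{\circ})$.
Thus, by (28), for some $j \geq 2$, $z \in \mathit{vars}(x\sigma^{j} = r\sigma^{j}) \setminus \mathrm{dom}(\sigma^{\circ})$ and, again by (28),
$z \notin \mathrm{dom}(\sigma)$ so that, by (26), $z \in \mathit{vars}(x\sigma = r\sigma)$. Therefore, as $z$ was an arbitrary
variable in $\mathit{vars}(\sigma^{\circ}) \setminus \mathrm{dom}(\sigma^{\circ})$,

$$\mathit{vars}(\sigma^{\circ}) \subseteq \bigl(\mathit{vars}(x\sigma = r\sigma) \cup \mathrm{dom}(\sigma^{\circ})\bigr). \tag{31}$$

It follows from (28) that $\mathrm{dom}(\sigma) \cap \mathit{vars}(x\sigma = r\sigma) \subseteq \mathrm{dom}(\sigma^{\circ})$ so that, by (29),

$$\mathrm{dom}(\sigma^{-}) \cap \mathit{vars}(x\sigma = r\sigma) = \emptyset. \tag{32}$$

Hence, by (29) and (31), we have

$$\mathrm{dom}(\sigma^{-}) \cap \mathit{vars}(\sigma^{\circ}) = \emptyset. \tag{33}$$

Let $\nu \in \mathit{RSubst}$ be a most general solution for $\{x\sigma = r\sigma\} \cup \sigma^{\circ}$ in $T$ so that

$$T \vdash \forall\bigl(\nu \leftrightarrow \{x\sigma = r\sigma\} \cup \sigma^{\circ}\bigr), \tag{34}$$
$$\mathit{vars}(\nu) \subseteq \bigl(\mathit{vars}(x\sigma = r\sigma) \cup \mathit{vars}(\sigma^{\circ})\bigr). \tag{35}$$

By Theorem 3, we can assume that

$$\nu \in \mathit{VSubst}. \tag{36}$$

By (32), (33), and (35), we have

$$\mathrm{dom}(\sigma^{-}) \cap \mathit{vars}(\nu) = \emptyset. \tag{37}$$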
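Therefore, as $\sigma^{-}, \nu \in \mathit{VSubst}$ (by (30) and (36)), we can use Lemma 7 to obtain
the following properties for $\nu \circ \sigma^{-}$.

$$T \vdash \forall\bigl((\nu \circ \sigma^{-}) \leftrightarrow (\nu \cup \sigma^{-})\bigr), \tag{38}$$
$$\mathrm{dom}(\nu \circ \sigma^{-}) = \mathrm{dom}(\nu \cup \sigma^{-}), \tag{39}$$
$$\nu \circ \sigma^{-} \in \mathit{VSubst}. \tag{40}$$

Now we have

$$T \vdash \forall\bigl(\mu \leftrightarrow \{x = r\} \cup \sigma\bigr) \qquad \text{[by hypothesis]}$$
$$T \vdash \forall\bigl(\mu \leftrightarrow \{x\sigma = r\sigma\} \cup \sigma\bigr) \qquad \text{[by Lemma 2 and the congruence axioms]}$$
$$T \vdash \forall\bigl(\mu \leftrightarrow \nu \cup \sigma^{-}\bigr) \qquad \text{[by (27) and (34)]}$$
$$T \vdash \forall\bigl(\mu \leftrightarrow \nu \circ \sigma^{-}\bigr) \qquad \text{[by (38)].} \tag{41}$$

Therefore, by Theorem 4,

$$\alpha(\mu, U) = \alpha(\nu \circ \sigma^{-}, U). \tag{42}$$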

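Part 2. To prove the result under the assumption that $\alpha(\sigma, U) = (sh, U)$, we
define $sh' \in \mathit{SH}$ so that

$$\alpha(\mu, U) = (sh', U). \tag{43}$$

Then, by (42), $\alpha(\nu \circ \sigma^{-}, U) = (sh', U)$. We show that $sh' \subseteq \mathrm{amgu}(sh, x \mapsto r)$. If
$sh' = \emptyset$, there is nothing to prove. Therefore, we assume that there exists $S \in sh'$
so that $S \neq \emptyset$ and, for some $v \in \mathit{Vars}$,

$$v \notin \mathrm{dom}(\nu \circ \sigma^{-}), \tag{44}$$

$$S \stackrel{\mathrm{def}}{=} \mathrm{occ}(\nu \circ \sigma^{-}, v). \tag{45}$$

Note that (39) and (44) imply that

$$v \notin \mathrm{dom}(\nu), \qquad v \notin \mathrm{dom}(\sigma^{-}). \tag{46}$$

Let

$$S' \stackrel{\mathrm{def}}{=} \bigcup \{\, \mathrm{occ}(\sigma, y) \mid y \in \mathrm{occ}(\nu, v) \,\}. \tag{47}$$

We show that

$$S = S'. \tag{48}$$

By (26), (36) and (40), $\sigma, \nu, \nu \circ \sigma^{-} \in \mathit{VSubst}$ and, by (44) and (46), $v \notin \mathrm{dom}(\nu \circ \sigma^{-})$
and $v \notin \mathrm{dom}(\nu)$. Thus, it follows from Lemma 13 with (45) and (47), that it
suffices to show that, for each $w \in \mathit{Vars}$, $v \in \mathit{vars}(w\sigma^{-}\nu)$ if and only if there exists
$z \in \mathit{vars}(w\sigma) \setminus \mathrm{dom}(\sigma)$ such that $v \in \mathit{vars}(z\nu)$.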
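First, we suppose that $v \in \mathit{vars}(w\sigma^{-}\nu)$. Thus, there exists $y \in \mathit{vars}(w\sigma^{-})$ such
that $v \in \mathit{vars}(y\nu)$. Since $\sigma^{\circ}, \nu \in \mathit{VSubst}$ (by (30) and (36)), $T \vdash \forall(\nu \to \sigma^{\circ})$
(by (34)), $v \notin \mathrm{dom}(\nu)$ (by (46)) and $T \vdash \forall\bigl(\nu \to (y\nu = y)\bigr)$ (using Lemma 2),
we can apply Lemma 6 (replacing $\tau$ by $\nu$, $\sigma$ by $\sigma^{\circ}$ and $s = t$ by $y\nu = y$) so
that there exists $z \in \mathit{vars}(y\sigma^{\circ}) \setminus \mathrm{dom}(\sigma^{\circ})$ such that $v \in \mathit{vars}(z\nu)$. We want to
show that $z \in \mathit{vars}(w\sigma) \setminus \mathrm{dom}(\sigma)$. Now either $z \in \mathrm{dom}(\nu)$ or $z = v$ so that, by
(37) (if $z \in \mathrm{dom}(\nu)$) or (46) (if $z = v$), $z \notin \mathrm{dom}(\sigma^{-})$. However, $z \notin \mathrm{dom}(\sigma^{\circ})$, so
that, by (27), $z \notin \mathrm{dom}(\sigma)$. Thus, it remains to prove that $z \in \mathit{vars}(w\sigma)$. Now, as
$y \in \mathit{vars}(w\sigma^{-})$ and $z \in \mathit{vars}(y\sigma^{\circ})$, we have $z \in \mathit{vars}(w\sigma^{-}\sigma^{\circ})$. So we must show
that $\mathit{vars}(w\sigma^{-}\sigma^{\circ}) \setminus \mathrm{dom}(\sigma) \subseteq \mathit{vars}(w\sigma)$. To see this note that, if $w \notin \mathrm{dom}(\sigma^{-})$,
then $w\sigma^{-} = w$ and, by (27), $w\sigma^{\circ} = w\sigma$ so that $w\sigma^{-}\sigma^{\circ} = w\sigma$. On the other
hand, if $w \in \mathrm{dom}(\sigma^{-})$, then, by (27), $w\sigma^{-} = w\sigma$ so that $w\sigma^{-}\sigma^{\circ} = w\sigma\sigma^{\circ}$. Now,
as $\sigma \in \mathit{VSubst}$ and $\sigma^{\circ} \subseteq \sigma$ (by (26) and (27)), we can apply Lemma 4 so that
$\mathit{vars}(w\sigma\sigma^{\circ}) \setminus \mathrm{dom}(\sigma) \subseteq \mathit{vars}(w\sigma)$. Hence, $\mathit{vars}(w\sigma^{-}\sigma^{\circ}) \setminus \mathrm{dom}(\sigma) \subseteq \mathit{vars}(w\sigma)$.

Secondly, suppose there exists $z \in \mathit{vars}(w\sigma) \setminus \mathrm{dom}(\sigma)$ such that $v \in \mathit{vars}(z\nu)$.
Then $v \in \mathit{vars}(w\sigma\nu)$. We need to show that $v \in \mathit{vars}(w\sigma^{-}\nu)$. By Eq. (27), if
$w \in \mathrm{dom}(\sigma^{-})$, then $w\sigma^{-}\nu = w\sigma\nu$ so that $v \in \mathit{vars}(w\sigma^{-}\nu)$. On the other hand, if
$w \notin \mathrm{dom}(\sigma^{-})$, then again, by (27), $v \in \mathit{vars}(w\sigma^{\circ}\nu)$. Moreover, $w = w\sigma^{-}$ so that,
by (34) and Lemma 2 with the congruence axioms, $T \vdash \forall\bigl(\nu \to (w\sigma^{\circ}\nu = w\sigma^{-})\bigr)$.
Hence, since $\nu \in \mathit{VSubst}$ (by (36)) and $v \notin \mathrm{dom}(\nu)$ (by (46)), we can apply Lemma 6
(replacing $\tau$ by $\nu$, $\sigma$ by the empty substitution and $s = t$ by $w\sigma^{\circ}\nu = w\sigma^{-}$) and
obtain $v \in \mathit{vars}(w\sigma^{-}\nu)$.

Therefore, as a consequence of the previous two paragraphs, for each $w \in \mathit{Vars}$,
we have $v \in \mathit{vars}(w\sigma^{-}\nu)$ if and only if there exists $z \in \mathit{vars}(w\sigma) \setminus \mathrm{dom}(\sigma)$ such that
$v \in \mathit{vars}(z\nu)$. It therefore follows that Eq. (48) holds.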

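Let

$$S_x \stackrel{\mathrm{def}}{=} \bigcup \Bigl(\{\, \mathrm{occ}(\sigma, y) \mid y \in \mathrm{occ}(\nu, v) \,\} \cap \mathrm{rel}(v_x, sh)\Bigr), \tag{49}$$
$$S_r \stackrel{\mathrm{def}}{=} \bigcup \Bigl(\{\, \mathrm{occ}(\sigma, y) \mid y \in \mathrm{occ}(\nu, v) \,\} \cap \mathrm{rel}(v_r, sh)\Bigr), \tag{50}$$
$$S_0 \stackrel{\mathrm{def}}{=} \bigcup \Bigl(\{\, \mathrm{occ}(\sigma, y) \mid y \in \mathrm{occ}(\nu, v) \,\} \cap \overline{\mathrm{rel}}(v_{xr}, sh)\Bigr). \tag{51}$$

Note that by (47), (48) and the fact that

$$\overline{\mathrm{rel}}(v_{xr}, sh) = sh \setminus \bigl(\mathrm{rel}(v_x, sh) \cup \mathrm{rel}(v_r, sh)\bigr),$$

we have

$$S_0 \supseteq S \setminus (S_x \cup S_r). \tag{52}$$

We now consider the two cases $S_0 \neq \emptyset$ and $S_0 = \emptyset$ separately.

Consider first the case when $S_0 \neq \emptyset$. Then, by (51), for some $y \in \mathit{Vars}$,

$$y \in \mathrm{occ}(\nu, v), \tag{53}$$
$$\mathrm{occ}(\sigma, y) \in \overline{\mathrm{rel}}(v_{xr}, sh). \tag{54}$$

Thus, by Lemma 12, $y \notin \mathrm{dom}(\sigma)$ and hence, by (27), $y \notin \mathrm{dom}(\sigma^{\circ})$. Also, by (54),
$\mathrm{occ}(\sigma, y) \cap v_{xr} = \emptyset$. Thus, as $\sigma \in \mathit{VSubst}$ (by (26)), we can use Lemma 13 to see
that, for each $w \in v_{xr}$, $y \notin \mathit{vars}(w\sigma)$ and hence, $y \notin \mathit{vars}(x\sigma = r\sigma)$. Therefore,
by (31) and (35), $y \notin \mathit{vars}(\nu)$. As $\nu \in \mathit{VSubst}$ (by (36)), we can apply Lemma 13
to both $\mathrm{occ}(\nu, y)$ and $\mathrm{occ}(\nu, v)$. Thus, as $y \notin \mathit{vars}(\nu)$, $\mathrm{occ}(\nu, y) = \{y\}$ and also
(using (53)) $v = y$ so that $\mathrm{occ}(\nu, v) = \{v\}$. It therefore follows from (47) and (48)
that $S = \mathrm{occ}(\sigma, v)$ and hence, from (54), that

$$S \in \overline{\mathrm{rel}}(v_{xr}, sh). \tag{55}$$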

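Now consider the case when $S_0 = \emptyset$. By (52), and the assumption that $S \neq \emptyset$,

$$S = S_x \cup S_r \neq \emptyset. \tag{56}$$

As a consequence of (49) and (50),

$$S_x \in \mathrm{rel}(v_x, sh)^\star \cup \{\emptyset\}, \tag{57}$$
$$S_r \in \mathrm{rel}(v_r, sh)^\star \cup \{\emptyset\}. \tag{58}$$

Now, by (56) either $S_x \neq \emptyset$ or $S_r \neq \emptyset$. We will show that both $S_x \neq \emptyset$ and
$S_r \neq \emptyset$. Suppose first that $S_x \neq \emptyset$. Then, by (57), $x \in S_x$. Hence, by (56), $x \in S$.
By (45), $x \in \mathrm{occ}(\nu \circ \sigma^{-}, v)$. However, $\nu \circ \sigma^{-} \in \mathit{VSubst}$ (by (40)) so that we can apply
Lemma 13 to $\mathrm{occ}(\nu \circ \sigma^{-}, v)$ and obtain that $v \in \mathit{vars}(x\sigma^{-}\nu)$. By the definition of $\mu$
in the hypothesis and (41), $T \vdash \forall\bigl(\nu \circ \sigma^{-} \to (x = r)\bigr)$ and hence, by Lemma 2 with
the congruence axioms, $T \vdash \forall\bigl(\nu \circ \sigma^{-} \to (x\sigma^{-}\nu = r)\bigr)$. Thus, as $\nu \circ \sigma^{-} \in \mathit{VSubst}$
(by (40)) and $v \notin \mathrm{dom}(\nu \circ \sigma^{-})$ (by (44)), we have, by Lemma 6 (replacing $\tau$ by
$\nu \circ \sigma^{-}$, $\sigma$ by the empty substitution and $s = t$ by $x\sigma^{-}\nu = r$), $v \in \mathit{vars}(r\sigma^{-}\nu)$. By
re-applying Lemma 13 to $\mathrm{occ}(\nu \circ \sigma^{-}, v)$, it can be seen that, as $v \notin \mathrm{dom}(\nu \circ \sigma^{-})$ (by (44)),
$v_r \cap \mathrm{occ}(\nu \circ \sigma^{-}, v) \neq \emptyset$. Hence, by (45), $S \cap v_r \neq \emptyset$. Thus, by (47) and (48), there
exists a $y \in \mathrm{occ}(\nu, v)$ such that $\mathrm{occ}(\sigma, y) \cap v_r \neq \emptyset$. Therefore, by (50), $S_r \cap v_r \neq \emptyset$
and so $S_r \neq \emptyset$. Secondly, by a similar argument, if $S_r \neq \emptyset$ then we have $S_x \neq \emptyset$.
Hence $S_x \neq \emptyset$ and $S_r \neq \emptyset$. So that, by (57) and (58), $S_x \in \mathrm{rel}(v_x, sh)^\star$ and
$S_r \in \mathrm{rel}(v_r, sh)^\star$. Therefore, we have, by (56),

$$S \in \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr). \tag{59}$$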
Combining (55) when $S_0 \neq \emptyset$ and (59) when $S_0 = \emptyset$ we obtain

$$S \in \overline{\mathrm{rel}}(v_{xr}, sh) \cup \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr)$$

and therefore, by (25),

$$S \in \mathrm{amgu}(sh, x \mapsto r).$$

As a consequence, since $S$ was any set in $sh'$, we have $sh' \subseteq \mathrm{amgu}(sh, x \mapsto r)$ and
hence, by (43),

$$\alpha(\mu, U) \preceq_{SS} \bigl(\mathrm{amgu}(sh, x \mapsto r), U\bigr). \tag{60}$$

We now drop the assumption that $\alpha(\sigma, U) = (sh, U)$ and just assume the
hypothesis of the theorem that $\alpha(\sigma, U) \preceq_{SS} (sh, U)$. Suppose $\alpha(\sigma, U) = (sh_1, U)$.
Then $sh_1 \subseteq sh$. It follows from Definition 9 that amgu is monotonic on its first
argument so that

$$\mathrm{amgu}(sh_1, x \mapsto r) \subseteq \mathrm{amgu}(sh, x \mapsto r).$$

Thus, by (60) (replacing $sh$ by $sh_1$), we obtain the required result

$$\alpha(\mu, U) \preceq_{SS} \bigl(\mathrm{amgu}(sh, x \mapsto r), U\bigr). \qquad \Box$$

Lemma 19 

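For each $sh_1, sh_2 \in \mathit{SH}$ we have

$$\mathrm{bin}(sh_1, sh_2)^\star = \mathrm{bin}(sh_1^\star, sh_2^\star).$$

Proof

Suppose $S \in \mathit{SG}$. Then $S \in \mathrm{bin}(sh_1, sh_2)^\star$ means that, for some $n \in \mathbb{N}$, there exist
sets $R_1, \ldots, R_n \in sh_1$ and $T_1, \ldots, T_n \in sh_2$ such that $S = (R_1 \cup T_1) \cup \cdots \cup (R_n \cup T_n)$.
Thus $S = (R_1 \cup \cdots \cup R_n) \cup (T_1 \cup \cdots \cup T_n)$. However, $R_1 \cup \cdots \cup R_n \in sh_1^\star$ and
$T_1 \cup \cdots \cup T_n \in sh_2^\star$. Thus $S \in \mathrm{bin}(sh_1^\star, sh_2^\star)$.

On the other hand, $S \in \mathrm{bin}(sh_1^\star, sh_2^\star)$ means that $S = R \cup T$ where, for some
$k, l \in \mathbb{N}$, $R_1, \ldots, R_k \in sh_1$, and $T_1, \ldots, T_l \in sh_2$, we have $R = R_1 \cup \cdots \cup R_k$ and
$T = T_1 \cup \cdots \cup T_l$. Let $n$ be the maximum of $\{k, l\}$ and suppose that, for each $i, j \in \mathbb{N}$
where $k + 1 \leq i \leq n$ and $l + 1 \leq j \leq n$, we define $R_i \stackrel{\mathrm{def}}{=} R_k$ and $T_j \stackrel{\mathrm{def}}{=} T_l$. Then,
$S = (R_1 \cup T_1) \cup \cdots \cup (R_n \cup T_n)$. However, for $1 \leq i \leq n$, $R_i \cup T_i \in \mathrm{bin}(sh_1, sh_2)$.
Thus $S \in \mathrm{bin}(sh_1, sh_2)^\star$. □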
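
As a sanity check only, the identity of Lemma 19 can be tested exhaustively over the sharing sets on three variables. The sketch assumes the illustrative `frozenset` encoding used for this kind of example; `powerset`, `star` and `bin_union` are hypothetical helper names.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))

def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def bin_union(sh1, sh2):
    return frozenset(S1 | S2 for S1 in sh1 for S2 in sh2)

GROUPS = [frozenset(c) for c in powerset("abc") if c]   # non-empty groups
ALL_SH = [frozenset(c) for c in powerset(GROUPS)]        # every sharing set

# bin(sh1, sh2)* == bin(sh1*, sh2*) for every pair of sharing sets.
lemma19_holds = all(
    star(bin_union(sh1, sh2)) == bin_union(star(sh1), star(sh2))
    for sh1 in ALL_SH for sh2 in ALL_SH
)

# One concrete instance, with sh1 = {{a}, {b}} and sh2 = {{c}}.
sh1 = frozenset({frozenset("a"), frozenset("b")})
sh2 = frozenset({frozenset("c")})
lhs = star(bin_union(sh1, sh2))
rhs = bin_union(star(sh1), star(sh2))
```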

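Proof of Theorem 6.

Let

$$sh_{-} \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(v_{xr}, sh),$$

$$sh_{xr} \stackrel{\mathrm{def}}{=} \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr).$$

Then, by Lemma 19 and the idempotence of $(\cdot)^\star$,

$$sh_{xr}^\star = sh_{xr}.$$

Moreover,

$$\mathrm{rel}(v_x, sh_{xr}) = sh_{xr}, \qquad \mathrm{rel}(v_x, sh_{-}) = \emptyset,$$

$$\mathrm{rel}(v_r, sh_{xr}) = sh_{xr}, \qquad \mathrm{rel}(v_r, sh_{-}) = \emptyset,$$

$$\overline{\mathrm{rel}}(v_{xr}, sh_{xr}) = \emptyset, \qquad \overline{\mathrm{rel}}(v_{xr}, sh_{-}) = sh_{-}.$$

Hence, we have

$$\mathrm{rel}(v_x, sh_{-} \cup sh_{xr}) = sh_{xr},$$

$$\mathrm{rel}(v_r, sh_{-} \cup sh_{xr}) = sh_{xr},$$

$$\overline{\mathrm{rel}}(v_{xr}, sh_{-} \cup sh_{xr}) = sh_{-}.$$

Now, by (25),

$$\mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), x \mapsto r\bigr)$$

$$= \overline{\mathrm{rel}}(v_{xr}, sh_{-} \cup sh_{xr}) \cup \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh_{-} \cup sh_{xr})^\star, \mathrm{rel}(v_r, sh_{-} \cup sh_{xr})^\star\bigr)$$

$$= sh_{-} \cup sh_{xr}$$

$$= \mathrm{amgu}(sh, x \mapsto r). \qquad \Box$$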

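For the proof of commutativity, we require the following auxiliary results.

Lemma 20

For each $V \in \wp_f(\mathit{Vars})$ and $sh \in \mathit{SH}$ we have

$$\overline{\mathrm{rel}}(V, sh^\star) = \overline{\mathrm{rel}}(V, sh)^\star.$$

Proof

Let $S \in \mathit{SG}$. Then $S \in \overline{\mathrm{rel}}(V, sh^\star)$ means $S \in sh^\star$ and $S \cap V = \emptyset$. In other words,
there exist $S_1, \ldots, S_n \in sh$ such that $S = \bigcup_{i=1}^{n} S_i$ and, for each $i = 1, \ldots, n$, we
have $S_i \cap V = \emptyset$. This amounts to saying that there exist $S_1, \ldots, S_n \in \overline{\mathrm{rel}}(V, sh)$
such that $S = \bigcup_{i=1}^{n} S_i$, which is equivalent to $S \in \overline{\mathrm{rel}}(V, sh)^\star$. □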

The auxiliary function rel possesses a weaker property. 

Lemma 21 

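For each $V \in \wp_f(\mathit{Vars})$ and $sh \in \mathit{SH}$ we have

$$\mathrm{rel}(V, sh^\star) \supseteq \mathrm{rel}(V, sh)^\star.$$

Proof

Let $S \in \mathit{SG}$. Then $S \in \mathrm{rel}(V, sh)^\star$ means that there exist $S_1, \ldots, S_n \in sh$ such
that $S_i \cap V \neq \emptyset$, for each $i = 1, \ldots, n$, and $S = \bigcup_{i=1}^{n} S_i$. Thus $S \cap V \neq \emptyset$ and
$S \in \mathrm{rel}(V, sh^\star)$. Hence, $\mathrm{rel}(V, sh^\star) \supseteq \mathrm{rel}(V, sh)^\star$. □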
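
A small worked instance shows why Lemma 21 states only an inclusion whereas Lemma 20 states an equality. The Python sketch below uses sh = {{x}, {y}} and V = {x}; the `frozenset` encoding and the helper names are assumptions of this example.

```python
def star(sh):
    closed = set()
    for group in sh:
        closed |= {group | other for other in closed} | {group}
    return frozenset(closed)

def rel(V, sh):
    return frozenset(S for S in sh if S & V)

def rel_bar(V, sh):
    return frozenset(sh) - rel(V, sh)

fs = frozenset
sh = fs({fs({"x"}), fs({"y"})})
V = fs({"x"})

# Lemma 20: rel_bar commutes with star-union.
bar_of_star = rel_bar(V, star(sh))
star_of_bar = star(rel_bar(V, sh))

# Lemma 21: for rel only the inclusion holds; here it is strict because
# {x, y} is in sh* and meets V, but {y} alone is not relevant to V.
rel_of_star = rel(V, star(sh))
star_of_rel = star(rel(V, sh))
```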

Lemma 22 

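For each $V \in \wp_f(\mathit{Vars})$, $sh_1, sh_2 \in \mathit{SH}$, and $S \in \wp_f(\mathit{Vars})$ we have

$$S \in \mathrm{rel}(V, sh_1 \cup sh_2)^\star \cup \{\emptyset\}$$

$$\iff \exists S_1 \in \mathrm{rel}(V, sh_1)^\star \cup \{\emptyset\} \,.\, \exists S_2 \in \mathrm{rel}(V, sh_2)^\star \cup \{\emptyset\} \,.\, S = S_1 \cup S_2.$$

Proof

If $S = \emptyset$ the statement is trivial.

Suppose $S \in \mathrm{rel}(V, sh_1 \cup sh_2)^\star$. Then, for some $n \in \mathbb{N}$, there exist $n$ sets
$R_1, \ldots, R_n \in (sh_1 \cup sh_2)$ such that $R_i \cap V \neq \emptyset$ for each $i = 1, \ldots, n$, and
$S = \bigcup_{i=1}^{n} R_i$. Suppose $S_j \stackrel{\mathrm{def}}{=} \bigcup \{\, R_i \in sh_j \mid 1 \leq i \leq n \,\}$ for $j = 1, 2$. Thus we have
$S_1 \in \mathrm{rel}(V, sh_1)^\star \cup \{\emptyset\}$, $S_2 \in \mathrm{rel}(V, sh_2)^\star \cup \{\emptyset\}$, and $S = S_1 \cup S_2$.

Suppose

$$\exists S_1 \in \mathrm{rel}(V, sh_1)^\star \cup \{\emptyset\} \,.\, \exists S_2 \in \mathrm{rel}(V, sh_2)^\star \cup \{\emptyset\} \,.\, S = S_1 \cup S_2,$$

with $S_1$ and $S_2$ not both empty. Then, for some $m \geq 0$ and $n \geq 0$, there exist
$R_1, \ldots, R_m \in \mathrm{rel}(V, sh_1)$ and $T_1, \ldots, T_n \in \mathrm{rel}(V, sh_2)$ such that $S_1 = \bigcup_{i=1}^{m} R_i$ and
$S_2 = \bigcup_{i=1}^{n} T_i$. Then $R_1, \ldots, R_m, T_1, \ldots, T_n \in \mathrm{rel}(V, sh_1 \cup sh_2)$ and

$$S = \Bigl(\bigcup_{i=1}^{m} R_i\Bigr) \cup \Bigl(\bigcup_{i=1}^{n} T_i\Bigr).$$

Thus $S \in \mathrm{rel}(V, sh_1 \cup sh_2)^\star$. □
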
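Lemma 23

For each $V_1, V_2 \in \wp_f(\mathit{Vars})$ and $sh \in \mathit{SH}$ we have

$$\mathrm{rel}\bigl(V_1, \overline{\mathrm{rel}}(V_2, sh)\bigr) = \overline{\mathrm{rel}}\bigl(V_2, \mathrm{rel}(V_1, sh)\bigr).$$

Proof

Suppose $S \in \mathit{SG}$. Then $S \in \mathrm{rel}\bigl(V_1, \overline{\mathrm{rel}}(V_2, sh)\bigr)$ means that $S \in sh$, $S \cap V_1 \neq \emptyset$ and $S \cap V_2 = \emptyset$.
Similarly, $S \in \overline{\mathrm{rel}}\bigl(V_2, \mathrm{rel}(V_1, sh)\bigr)$ means that $S \in sh$, $S \cap V_2 = \emptyset$ and $S \cap V_1 \neq \emptyset$. □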

Proof of Theorem 7. 

We let $R$, $S$, $T$, and $U$ (possibly subscripted) denote elements of $sh^\star$. The subscripts
reflect certain properties of the sets. In particular, subscripts $x, r, xr, y, t, yt$ indicate
sets of variables that definitely have a variable in common with the subscripted set.
For example, $R_x$ is a set in $sh^\star$ that has a common element with $v_x$ and $T_{xt}$ is a
set in $sh^\star$ that has common elements with $v_x$ and $v_t$. In contrast, the subscript '$-$'
indicates that the subscripted set does not share with one of the sets $v_{xr}$ or $v_{yt}$. Of
course, in the proof, each set is formally defined as needed.

Suppose that

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), y \mapsto t\bigr).$$

We will show that

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr).$$

The converse then holds by simply exchanging $x$ and $y$, and $r$ and $t$.

There are two cases due to the two components of the definition of amgu in
Eq. (25).

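Case 1. Assume

$$S \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{amgu}(sh, x \mapsto r)\bigr).$$

Then $S \in \mathrm{amgu}(sh, x \mapsto r)$ and $S \cap v_{yt} = \emptyset$. Again there are two possibilities.

Subcase 1a. Suppose first that

$$S \in \overline{\mathrm{rel}}(v_{xr}, sh).$$

Thus $S \in sh$, and, since in this case we have $S \cap v_{yt} = \emptyset$,

$$S \in \overline{\mathrm{rel}}(v_{yt}, sh).$$

The alternative definition of amgu, (25), implies $\overline{\mathrm{rel}}(v_{yt}, sh) \subseteq \mathrm{amgu}(sh, y \mapsto t)$ and
thus we have also

$$S \in \mathrm{amgu}(sh, y \mapsto t).$$

Now, since the hypothesis of this subcase implies $S \cap v_{xr} = \emptyset$, we obtain

$$S \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{amgu}(sh, y \mapsto t)\bigr).$$

Hence, again by (25), we can conclude that

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr).$$

Subcase 1b. Suppose now that

$$S \in \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr).$$

Then, there exist $S_x, S_r \in \mathit{SG}$ such that $S = S_x \cup S_r$, where

$$S_x \in \mathrm{rel}(v_x, sh)^\star, \qquad S_r \in \mathrm{rel}(v_r, sh)^\star.$$

By the hypothesis for this case we have $S \cap v_{yt} = \emptyset$ and thus $S_x \cap v_{yt} = \emptyset$ and
$S_r \cap v_{yt} = \emptyset$. This allows us to state that

$$S_x \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_x, sh)^\star\bigr), \qquad S_r \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_r, sh)^\star\bigr),$$

and hence, by Lemma 20,

$$S_x \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_x, sh)\bigr)^\star, \qquad S_r \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_r, sh)\bigr)^\star.$$

Thus, by Lemma 23,

$$S_x \in \mathrm{rel}\bigl(v_x, \overline{\mathrm{rel}}(v_{yt}, sh)\bigr)^\star, \qquad S_r \in \mathrm{rel}\bigl(v_r, \overline{\mathrm{rel}}(v_{yt}, sh)\bigr)^\star,$$

so that, by (23), (24), and (25),

$$S_x \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star, \qquad S_r \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star.$$

Therefore,

$$S_x \cup S_r \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star, \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star\Bigr)$$

so that, as $S_x \cup S_r = S$, it follows from (25) that

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr).$$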

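Case 2. Assume

$$S \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_y, \mathrm{amgu}(sh, x \mapsto r)\bigr)^\star, \mathrm{rel}\bigl(v_t, \mathrm{amgu}(sh, x \mapsto r)\bigr)^\star\Bigr).$$

Then there exist $S_y, S_t \in \mathit{SG}$ such that

$$S = S_y \cup S_t \tag{61}$$

where

$$S_y \in \mathrm{rel}\bigl(v_y, \mathrm{amgu}(sh, x \mapsto r)\bigr)^\star, \qquad S_t \in \mathrm{rel}\bigl(v_t, \mathrm{amgu}(sh, x \mapsto r)\bigr)^\star. \tag{62}$$

Then, by Lemma 21,

$$S_y \cap v_y \neq \emptyset, \qquad S_t \cap v_t \neq \emptyset. \tag{63}$$

By (25) and Lemma 22, there exist $R_{-}$, $R_{xr}$, $T_{-}$, and $T_{xr}$ such that

$$S_y = R_{-} \cup R_{xr}, \qquad S_t = T_{-} \cup T_{xr} \tag{64}$$

where

$$R_{-} \in \mathrm{rel}\bigl(v_y, \overline{\mathrm{rel}}(v_{xr}, sh)\bigr)^\star \cup \{\emptyset\},$$
$$R_{xr} \in \mathrm{rel}\Bigl(v_y, \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr)\Bigr)^\star \cup \{\emptyset\},$$
$$T_{-} \in \mathrm{rel}\bigl(v_t, \overline{\mathrm{rel}}(v_{xr}, sh)\bigr)^\star \cup \{\emptyset\},$$
$$T_{xr} \in \mathrm{rel}\Bigl(v_t, \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr)\Bigr)^\star \cup \{\emptyset\}. \tag{65}$$

Then, by Lemmas 23 and 20,

$$R_{-} \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{rel}(v_y, sh)^\star\bigr) \cup \{\emptyset\},$$
$$T_{-} \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{rel}(v_t, sh)^\star\bigr) \cup \{\emptyset\}. \tag{66}$$

Also, using Lemmas 21, 19, and then the idempotence of $(\cdot)^\star$,

$$R_{xr} \in \mathrm{rel}\Bigl(v_y, \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr)\Bigr) \cup \{\emptyset\},$$
$$T_{xr} \in \mathrm{rel}\Bigl(v_t, \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr)\Bigr) \cup \{\emptyset\}. \tag{67}$$

Subcase 2a. Suppose $R_{xr} = T_{xr} = \emptyset$. Then, by (64),

$$S_y = R_{-}, \qquad S_t = T_{-}. \tag{68}$$

By (63), $R_{-}, T_{-} \neq \emptyset$ and hence, using (66),

$$R_{-} \cup T_{-} \in \mathrm{bin}\bigl(\mathrm{rel}(v_y, sh)^\star, \mathrm{rel}(v_t, sh)^\star\bigr),$$

so that, by (25),

$$R_{-} \cup T_{-} \in \mathrm{amgu}(sh, y \mapsto t).$$

Also, it follows from (66) that $R_{-} \cap v_{xr} = \emptyset$ and $T_{-} \cap v_{xr} = \emptyset$, so that

$$R_{-} \cup T_{-} \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{amgu}(sh, y \mapsto t)\bigr).$$

However, by (61) and (68), $S = R_{-} \cup T_{-}$ so that, by (25),

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr).$$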
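Subcase 2b. Suppose $R_{xr} \cup T_{xr} \neq \emptyset$. Then, by (67),

$$(R_{xr} \cup T_{xr}) \cap v_{yt} \neq \emptyset. \tag{69}$$

The proof of this subcase is in two parts. In the first part we divide $R_{xr}$ and $T_{xr}$
into a number of subsets. In the second part, these subsets will be reassembled so
as to prove the required result.

First, by (67), there exist $R_x, R_r, T_x, T_r \in \wp_f(\mathit{Vars})$ such that

$$R_{xr} = R_x \cup R_r, \qquad T_{xr} = T_x \cup T_r, \tag{70}$$

where either $R_x = R_r = \emptyset$ or

$$R_x \in \mathrm{rel}(v_x, sh)^\star, \qquad R_r \in \mathrm{rel}(v_r, sh)^\star,$$

and either $T_x = T_r = \emptyset$ or

$$T_x \in \mathrm{rel}(v_x, sh)^\star, \qquad T_r \in \mathrm{rel}(v_r, sh)^\star.$$

Thus, if either $R_x \cup T_x = \emptyset$ or $R_r \cup T_r = \emptyset$, it follows that

$$R_{xr} \cup T_{xr} = (R_x \cup R_r) \cup (T_x \cup T_r) = \emptyset.$$

However, by (69), $R_{xr} \cup T_{xr} \neq \emptyset$, so that we have

$$R_x \cup T_x \neq \emptyset, \qquad R_r \cup T_r \neq \emptyset. \tag{71}$$

We now subdivide the sets $R_x$, $T_x$, $R_r$, and $T_r$ further. First note that

$$sh = \overline{\mathrm{rel}}(v_{yt}, sh) \cup \mathrm{rel}(v_y, sh) \cup \overline{\mathrm{rel}}\bigl(v_y, \mathrm{rel}(v_t, sh)\bigr),$$

$$sh = \overline{\mathrm{rel}}(v_{yt}, sh) \cup \overline{\mathrm{rel}}\bigl(v_t, \mathrm{rel}(v_y, sh)\bigr) \cup \mathrm{rel}(v_t, sh).$$

Hence, by Lemma 22, sets $R_{x-}$, $R_{xy}$, $R_{xt}$, $R_{r-}$, $R_{ry}$, $R_{rt}$, $T_{x-}$, $T_{xy}$, $T_{xt}$, $T_{r-}$, $T_{ry}$,
$T_{rt} \in \wp_f(\mathit{Vars})$ exist such that

$$R_x = R_{x-} \cup R_{xy} \cup R_{xt}, \qquad T_x = T_{x-} \cup T_{xy} \cup T_{xt},$$
$$R_r = R_{r-} \cup R_{ry} \cup R_{rt}, \qquad T_r = T_{r-} \cup T_{ry} \cup T_{rt}, \tag{72}$$

where

$$R_{x-}, T_{x-} \in \mathrm{rel}\bigl(v_x, \overline{\mathrm{rel}}(v_{yt}, sh)\bigr)^\star \cup \{\emptyset\},$$
$$R_{r-}, T_{r-} \in \mathrm{rel}\bigl(v_r, \overline{\mathrm{rel}}(v_{yt}, sh)\bigr)^\star \cup \{\emptyset\}, \tag{73}$$

and

$$R_{xy}, T_{xy} \in \mathrm{rel}\bigl(v_x, \mathrm{rel}(v_y, sh)\bigr)^\star \cup \{\emptyset\},$$
$$R_{ry}, T_{ry} \in \mathrm{rel}\bigl(v_r, \mathrm{rel}(v_y, sh)\bigr)^\star \cup \{\emptyset\},$$
$$R_{xt}, T_{xt} \in \mathrm{rel}\bigl(v_x, \mathrm{rel}(v_t, sh)\bigr)^\star \cup \{\emptyset\},$$
$$R_{rt}, T_{rt} \in \mathrm{rel}\bigl(v_r, \mathrm{rel}(v_t, sh)\bigr)^\star \cup \{\emptyset\}, \tag{74}$$

and also

$$(R_x \setminus R_{xy}) \cap v_y = \emptyset, \qquad (T_x \setminus T_{xt}) \cap v_t = \emptyset,$$
$$(R_r \setminus R_{ry}) \cap v_y = \emptyset, \qquad (T_r \setminus T_{rt}) \cap v_t = \emptyset. \tag{75}$$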
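We note a few simple but useful consequences of these definitions. First, it follows
from (73) using (23), (24), and (25), that

$$R_{x-}, T_{x-} \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star \cup \{\emptyset\},$$
$$R_{r-}, T_{r-} \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star \cup \{\emptyset\}. \tag{76}$$

Secondly, using (73) with Lemma 21, we have

$$R_{x-}, T_{x-}, R_{r-}, T_{r-} \in \overline{\mathrm{rel}}(v_{yt}, sh)^\star \cup \{\emptyset\}, \tag{77}$$

and then, using this with (69), (70), and (72), it follows that

$$R_{xy} \cup T_{xy} \cup R_{ry} \cup T_{ry} \cup R_{xt} \cup T_{xt} \cup R_{rt} \cup T_{rt} \neq \emptyset. \tag{78}$$

In the second part of the proof for this subcase, the component subsets of $S$ are
reassembled in an order that proves the required result. First, let

$$U_y \stackrel{\mathrm{def}}{=} R_{-} \cup R_{xy} \cup R_{ry} \cup T_{xy} \cup T_{ry},$$
$$U_t \stackrel{\mathrm{def}}{=} T_{-} \cup R_{xt} \cup R_{rt} \cup T_{xt} \cup T_{rt}, \tag{79}$$

and

$$U \stackrel{\mathrm{def}}{=} U_y \cup U_t. \tag{80}$$

By relations (65) and (74) (with Lemma 21), each component set in the definition
of $U_y$ is in $\mathrm{rel}(v_y, sh)^\star \cup \{\emptyset\}$ and each component set in the definition of $U_t$ is in
$\mathrm{rel}(v_t, sh)^\star \cup \{\emptyset\}$. Thus, by the definition of $(\cdot)^\star$,

$$U_y \in \mathrm{rel}(v_y, sh)^\star \cup \{\emptyset\},$$
$$U_t \in \mathrm{rel}(v_t, sh)^\star \cup \{\emptyset\}. \tag{81}$$

By (70) and (75) we have

$$\bigl(R_{xr} \setminus (R_{xy} \cup R_{ry})\bigr) \cap v_y = \emptyset$$

and hence, by (64), we have also that

$$\bigl(S_y \setminus (R_{xy} \cup R_{ry} \cup R_{-})\bigr) \cap v_y = \emptyset.$$

By (63), $S_y \cap v_y \neq \emptyset$. Thus, $R_{xy} \cup R_{ry} \cup R_{-} \neq \emptyset$ and, as a consequence of (79),
$U_y \neq \emptyset$. For similar reasons, $U_t \neq \emptyset$. Hence, by (80),

$$U \in \mathrm{bin}\bigl(\mathrm{rel}(v_y, sh)^\star, \mathrm{rel}(v_t, sh)^\star\bigr),$$

and therefore, using (25), it follows that

$$U \in \mathrm{amgu}(sh, y \mapsto t). \tag{82}$$

Now, by (78), at least one of the following two inequalities holds:

$$R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset,$$
$$R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset. \tag{83}$$

Assume first that $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} = \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset$. Then,
using (71) and (72) with the first of these,

$$R_{x-} \cup T_{x-} \neq \emptyset.$$

Also, using (74) with the second, we have $(R_{ry} \cup R_{rt} \cup T_{ry} \cup T_{rt}) \cap v_r \neq \emptyset$ and
therefore it follows from (79) and (80), that

$$U \cap v_r \neq \emptyset.$$

Hence, by (76) and (82),

$$R_{x-} \cup T_{x-} \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star,$$
$$U \cup R_{r-} \cup T_{r-} \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star. \tag{84}$$

Similarly, assuming $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} = \emptyset$ it
follows that

$$R_{r-} \cup T_{r-} \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star,$$
$$R_{x-} \cup T_{x-} \cup U \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star. \tag{85}$$

Finally, assuming $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset$ it follows
from (74) that $U \cap v_x \neq \emptyset$ and $U \cap v_r \neq \emptyset$, and hence

$$R_{x-} \cup T_{x-} \cup U \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star,$$
$$U \cup R_{r-} \cup T_{r-} \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star. \tag{86}$$

Thus, as one of the inequalities in (83) holds, one of (84), (85) or (86) holds so that

$$R_{x-} \cup T_{x-} \cup U \cup R_{r-} \cup T_{r-} \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star, \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star\Bigr).$$

However, since

$$S = R_{x-} \cup T_{x-} \cup U \cup R_{r-} \cup T_{r-},$$

we have

$$S \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_x, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star, \mathrm{rel}\bigl(v_r, \mathrm{amgu}(sh, y \mapsto t)\bigr)^\star\Bigr).$$

Hence, by (25),

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr). \qquad \Box$$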



6.5 Proofs of Results for Sharing Domains

We prove all the results in this section by induction on the cardinality of a
substitution $\nu$. For each result, the proof is obvious if $\nu$ is empty or does not unify. Thus,
in the following proofs, we assume that $\nu$ unifies and is non-empty. We suppose
that $(x \mapsto r) \in \nu$ and let $\nu' = \nu \setminus \{x \mapsto r\}$.
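Proof of Lemma 18.

We have

$$\mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto t\bigr), \nu\Bigr)$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}((sh, U), y \mapsto t), x \mapsto r\bigr), \nu'\Bigr) \qquad \text{[Def. 11]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), y \mapsto t\bigr), \nu'\Bigr) \qquad \text{[Cor. 3]}$$

$$= \mathrm{Amgu}\Bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), \nu'\bigr), y \mapsto t\Bigr) \qquad \text{[induction]}$$

$$= \mathrm{Amgu}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu\bigr), y \mapsto t\Bigr) \qquad \text{[Def. 11].} \qquad \Box$$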

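Proof of Theorem 8.

Let $\mu'$ be a most general solution for $(\nu' \cup \sigma)$. Then

$$\alpha(\sigma, U) \preceq_{SS} (sh, U)$$

$$\implies \alpha\bigl(\mu', U \cup \mathit{vars}(\nu')\bigr) \preceq_{SS} \mathrm{aunify}\bigl((sh, U), \nu'\bigr) \qquad \text{[induction]}$$

$$\implies \alpha\bigl(\mu, U \cup \mathit{vars}(\nu)\bigr) \preceq_{SS} \mathrm{Amgu}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu'\bigr), x \mapsto r\Bigr) \qquad \text{[Cor. 1]}$$

$$\implies \alpha\bigl(\mu, U \cup \mathit{vars}(\nu)\bigr) \preceq_{SS} \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), \nu'\Bigr) \qquad \text{[Lem. 18]}$$

$$\implies \alpha\bigl(\mu, U \cup \mathit{vars}(\nu)\bigr) \preceq_{SS} \mathrm{aunify}\bigl((sh, U), \nu\bigr) \qquad \text{[Def. 11].} \qquad \Box$$

Proof of Theorem 9.

We have

$$\mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu\bigr), \nu\Bigr)$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl(\mathrm{aunify}(\mathrm{Amgu}((sh, U), x \mapsto r), \nu'), x \mapsto r\bigr), \nu'\Bigr) \qquad \text{[Def. 11]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}(\mathrm{Amgu}((sh, U), x \mapsto r), x \mapsto r), \nu'\bigr), \nu'\Bigr) \qquad \text{[Lem. 18]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), x \mapsto r\bigr), \nu'\Bigr) \qquad \text{[induction]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), \nu'\Bigr) \qquad \text{[Cor. 2]}$$

$$= \mathrm{aunify}\bigl((sh, U), \nu\bigr) \qquad \text{[Def. 11].} \qquad \Box$$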



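Proof of Theorem 10.

The induction is on the set of equations $\nu_1$. The comments at the start of this
section apply therefore to $\nu_1$ instead of $\nu$ and thus we let $\nu_1' \stackrel{\mathrm{def}}{=} \nu_1 \setminus \{x \mapsto r\}$ so
that we have

$$\mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu_1\bigr), \nu_2\Bigr)$$

$$= \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), \nu_1'\bigr), \nu_2\Bigr) \qquad \text{[Def. 11]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), \nu_2\bigr), \nu_1'\Bigr) \qquad \text{[induction]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{Amgu}\bigl(\mathrm{aunify}((sh, U), \nu_2), x \mapsto r\bigr), \nu_1'\Bigr) \qquad \text{[Lem. 18]}$$

$$= \mathrm{aunify}\Bigl(\mathrm{aunify}\bigl((sh, U), \nu_2\bigr), \nu_1\Bigr) \qquad \text{[Def. 11].} \qquad \Box$$
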



7 Conclusion 



The Sharing domain, which was defined in (Jacobs and Langen 1989, Langen 1990),
is considered to be the principal abstract domain for sharing analysis of logic
programs in both practical work and theoretical study. For many years, this domain
was accepted and implemented as it was. However, in (Bagnara et al. 1997), we
proved that Sharing is, in fact, redundant for pair-sharing and we identified the
weakest abstraction of Sharing that can capture pair-sharing with the same degree
of precision. One notable advantage of this abstraction is that the costly star-union
operator is no longer necessary. The question of whether the abstract operations
for Sharing were complete or optimal was studied by Cortesi and Filé (Cortesi and
Filé 1999). Here it is proved that although the 'U' and projection operations are
complete (and hence, optimal), aunify is optimal but not complete. The problem
of scalability of Sharing, still retaining as much precision as possible, was tackled in
(Zaffanella, Bagnara and Hill 1999a), where a family of widenings is presented that
allow the desired goal to be achieved. In (Zaffanella, Hill and Bagnara 1999b,
Zaffanella, Hill and Bagnara 2001), the decomposition of Sharing and its non-redundant
counterpart via complementation is studied. This shows the close relationship
between these domains and PS (the usual domain for pair-sharing) and Def (the
domain of definite Boolean functions). Many sharing analysis techniques and/or
enhancements have been advocated to have potential for improving the precision
of the sharing information over and above that obtainable using the classical
combination of Sharing with the usual domains for linearity and freeness. Moreover,
these enhancements had been circulating for years without an adequate supporting
experimental evaluation. Thus we investigated these techniques to see if and
by how much they could improve precision. Using the China analyzer (Bagnara
1997) for the experimental part of the work, we discovered that, apart from the
enhancement that upgrades Sharing with structural information, these techniques
had little impact on precision (Bagnara, Zaffanella and Hill 2000).

In this paper, we have defined a new abstraction function mapping a set of
substitutions in rational solved form into their corresponding sharing abstraction. The
new function is a generalisation of the classical abstraction function of (Jacobs and
Langen 1989), which was defined for idempotent substitutions only. Using our new
abstraction function, we have proved the soundness of the classical abstract
unification operator aunify. Other contributions of our work are the formal proofs of
the commutativity and idempotence of the aunify operator on the Sharing domain.
Even if commutativity was a known property, the corresponding proof in (Langen
1990) was not satisfactory. As far as idempotence is concerned, our result differs
from that given in (Langen 1990), which was based on a composite abstract
unification operator performing also the renaming of variables. It is our opinion that our
main result, the soundness of the aunify operator, is really valuable as it allows for
the safe application of sharing analysis based on Sharing to any constraint logic
language supporting syntactic term structures, based on either finite trees or rational
trees. This happens because our result does not rely on the presence (or even the
absence) of the occurs-check in the concrete unification procedure implemented by
the analysed language. Furthermore, as the groundness domain Def is included in
Sharing, our main soundness result also shows that Def is sound for non-idempotent
substitutions.

From a technical point of view, we have introduced a new class of concrete
substitutions based on the notion of variable-idempotence, generalizing the classical
concept of idempotence. We have shown that any substitution is equivalent to a
variable-idempotent one, providing a finite sequence of transformations for its
construction. This result assumes an arbitrary equality theory and is therefore
applicable to the study of any abstract property which is preserved by logical equivalence.
Our application of this idea to the study of the soundness of abstract unification
for Sharing has shown that it is particularly suitable for data-flow analyzers where
the corresponding abstraction function only depends on the set of variables
occurring in a term. However, we believe that this concept can be usefully exploited
in a more general context. Possible applications include the proofs of optimality
and completeness of abstract operators with respect to the corresponding concrete
operators defined on a domain of substitutions in rational solved form.
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